Abstract
1 Introduction


Throughout the paper, $H$ is a separable real Hilbert space with scalar product $\langle\cdot\mid\cdot\rangle$, associated norm $\|\cdot\|$, and Borel $\sigma$-algebra $\mathcal{B}$.

A large array of problems arising in Hilbertian nonlinear analysis are captured by the following 
simple formulation. 

Problem 1.1 Let $A\colon H\to 2^{H}$ be a set-valued maximally monotone operator, let $\vartheta\in\left]0,+\infty\right[$, and let $B\colon H\to H$ be a $\vartheta$-cocoercive operator, i.e.,

\[ (\forall x\in H)(\forall y\in H)\quad \langle x-y\mid Bx-By\rangle\ge\vartheta\|Bx-By\|^2, \tag{1.1} \]

such that

\[ F=\{z\in H\mid 0\in Az+Bz\}\neq\varnothing. \tag{1.2} \]

The problem is to find a point in $F$.

Instances of Problem 1.1 are found in areas such as evolution inclusions [2], optimization [4, 38, 51], Nash equilibria [7], image recovery [8, 10, 18], inverse problems, signal processing [21], statistics [25], machine learning [26], variational inequalities [31, 52], mechanics [40, 41], and structure design [50]. For instance, an important specialization of Problem 1.1 in the context of convex optimization is the following [4, Section 27.3].

Problem 1.2 Let $f\colon H\to\left]-\infty,+\infty\right]$ be a proper lower semicontinuous convex function, let $\vartheta\in\left]0,+\infty\right[$, and let $g\colon H\to\mathbb{R}$ be a differentiable convex function such that $\nabla g$ is $\vartheta^{-1}$-Lipschitz continuous on $H$. The problem is to

\[ \underset{x\in H}{\text{minimize}}\;\; f(x)+g(x), \tag{1.3} \]

under the assumption that $F=\operatorname{Argmin}(f+g)\neq\varnothing$.
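The link between Problem 1.2 and Problem 1.1 can be spelled out as follows. This is only a sketch: it relies on the Baillon–Haddad theorem, Fermat's rule, and the identity $J_{\gamma\partial f}=\operatorname{prox}_{\gamma f}$, which are standard facts (see [4]) that are not restated in the text; the choice $A=\partial f$, $B=\nabla g$ is the assumed identification.

```latex
% Sketch: Problem 1.2 as an instance of Problem 1.1 with A = \partial f, B = \nabla g.
\begin{align*}
&\text{(Baillon--Haddad)} &&
  \nabla g \text{ is } \vartheta^{-1}\text{-Lipschitz}
  \;\Longleftrightarrow\;
  (\forall x,y)\ \langle x-y \mid \nabla g(x)-\nabla g(y)\rangle
  \ge \vartheta\,\|\nabla g(x)-\nabla g(y)\|^{2},\\
&\text{(Fermat's rule)} &&
  z\in\operatorname{Argmin}(f+g)
  \;\Longleftrightarrow\;
  0\in\partial f(z)+\nabla g(z),\\
&\text{(resolvent)} &&
  J_{\gamma\partial f}=\operatorname{prox}_{\gamma f}
  \;\Longrightarrow\;
  J_{\gamma A}(x-\gamma Bx)=\operatorname{prox}_{\gamma f}\bigl(x-\gamma\nabla g(x)\bigr).
\end{align*}
```

The first line gives (1.1), the second gives (1.2), and the third shows that a forward step on $B$ followed by a resolvent step on $A$ is a proximal-gradient step.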

A standard method to solve Problem 1.1 is the forward-backward algorithm [14, 52], which constructs a sequence $(x_n)_{n\in\mathbb{N}}$ in $H$ by iterating

\[ (\forall n\in\mathbb{N})\quad x_{n+1}=J_{\gamma_n A}(x_n-\gamma_n Bx_n),\quad\text{where}\quad 0<\gamma_n<2\vartheta. \tag{1.4} \]

Recent theoretical advances on deterministic versions of this algorithm can be found in [6, 11, 20, 22]. Let us also stress that a major motivation for studying the forward-backward algorithm is that it can be applied not only to Problem 1.1 per se, but also to systems of coupled monotone inclusions via product space reformulations [2], to strongly monotone composite inclusion problems via duality arguments [15, 20], and to primal-dual composite problems via renorming in the primal-dual space [20, 53]. Thus, new developments on (1.4) lead to new algorithms for solving these problems.
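As an illustration of the forward-backward iteration, here is a minimal sketch on a toy instance that is not taken from the paper: $A=\partial(\mu\|\cdot\|_1)$ on $\mathbb{R}^N$, whose resolvent is componentwise soft thresholding, and $Bx=x-b$, which is $1$-cocoercive, so any constant step $\gamma\in\left]0,2\right[$ is admissible. The names `soft_threshold` and `forward_backward` are hypothetical.

```python
def soft_threshold(v, t):
    """Componentwise resolvent of t * d|.| (soft thresholding at level t)."""
    out = []
    for vi in v:
        mag = abs(vi) - t
        out.append((mag if mag > 0 else 0.0) * (1.0 if vi >= 0 else -1.0))
    return out


def forward_backward(b, mu, gamma, n_iter):
    """Iterate x_{n+1} = J_{gamma A}(x_n - gamma * B x_n) for the toy choice
    A = subdifferential of mu * ||.||_1 and B x = x - b (assumed example)."""
    x = [0.0] * len(b)
    for _ in range(n_iter):
        # Forward (explicit) step on the cocoercive operator B.
        forward = [xi - gamma * (xi - bi) for xi, bi in zip(x, b)]
        # Backward (implicit) step: resolvent of gamma * A.
        x = soft_threshold(forward, gamma * mu)
    return x
```

For this toy problem the unique zero of $A+B$ is the soft thresholding of $b$ at level $\mu$, so `forward_backward([3.0, -0.5, 1.2], 1.0, 1.5, 200)` should approach `[2.0, 0.0, 0.2]`.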

Our paper addresses the following stochastic version of (1.4) in which, at each iteration $n$, $u_n$ stands for a stochastic approximation to $Bx_n$ and $a_n$ stands for a stochastic perturbation modeling the approximate implementation of the resolvent operator $J_{\gamma_n A}$. Let $(\Omega,\mathcal{F},\mathsf{P})$ be the underlying probability space. An $H$-valued random variable is a measurable map $x\colon(\Omega,\mathcal{F})\to(H,\mathcal{B})$ and, for every $p\in\left[1,+\infty\right[$, $L^p(\Omega,\mathcal{F},\mathsf{P};H)$ denotes the space of equivalence classes of $H$-valued random variables $x$ such that $\int_\Omega\|x\|^p\,d\mathsf{P}<+\infty$.

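Algorithm 1.3 Consider the setting of Problem 1.1. Let $x_0$, $(u_n)_{n\in\mathbb{N}}$, and $(a_n)_{n\in\mathbb{N}}$ be random variables in $L^2(\Omega,\mathcal{F},\mathsf{P};H)$, let $(\lambda_n)_{n\in\mathbb{N}}$ be a sequence in $\left]0,1\right]$, and let $(\gamma_n)_{n\in\mathbb{N}}$ be a sequence in $\left]0,2\vartheta\right[$. Set

\[ (\forall n\in\mathbb{N})\quad x_{n+1}=x_n+\lambda_n\bigl(J_{\gamma_n A}(x_n-\gamma_n u_n)+a_n-x_n\bigr). \tag{1.5} \]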
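The recursion (1.5) can be sketched on the same kind of toy instance as before; everything specific here is an assumption made for illustration: $A=\partial(\mu\|\cdot\|_1)$, $Bx=x-b$, unbiased Gaussian noise on $Bx_n$ with standard deviation $(n+1)^{-2}$ (chosen summable so that the example is well behaved), exact resolvents ($a_n=0$), and a constant relaxation parameter. The function names are hypothetical.

```python
import random


def soft_threshold(v, t):
    """Componentwise resolvent of t * d|.| (soft thresholding at level t)."""
    out = []
    for vi in v:
        mag = abs(vi) - t
        out.append((mag if mag > 0 else 0.0) * (1.0 if vi >= 0 else -1.0))
    return out


def stochastic_forward_backward(b, mu, gamma, lam, n_iter, seed=0):
    """Sketch of x_{n+1} = x_n + lam * (J_{gamma A}(x_n - gamma * u_n) + a_n - x_n)
    with a_n = 0 and u_n = B x_n + noise (assumed noise model, not from the paper)."""
    rng = random.Random(seed)
    x = [0.0] * len(b)
    for n in range(n_iter):
        sigma = 1.0 / (n + 1) ** 2  # assumed noise level, summable over n
        # Stochastic approximation u_n of B x_n = x_n - b.
        u = [(xi - bi) + rng.gauss(0.0, sigma) for xi, bi in zip(x, b)]
        r = [xi - gamma * ui for xi, ui in zip(x, u)]
        t = soft_threshold(r, gamma * mu)  # J_{gamma A}(x_n - gamma * u_n)
        # Relaxation step with parameter lam in ]0, 1].
        x = [xi + lam * (ti - xi) for xi, ti in zip(x, t)]
    return x
```

With `b = [3.0, -0.5, 1.2]`, `mu = 1.0`, `gamma = 1.0`, and `lam = 0.8`, the iterates should settle near the deterministic solution `[2.0, 0.0, 0.2]` once the noise has died out.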

The first instances of the stochastic iteration (II.5D can be traced back to Il4^ in the context 
of the gradient method, i.e., when A = 0 and B is the gradient of a convex function. Stochastic 
approximations in the gradient method were then investigated in the Russian literature of the late 
1960s and early 1970s Il27l |2^ |29l l3^ |4^ . Stochastic gradient methods have also been used 

extensively in adaptive signal processing, in control, and in machine learning, e.g., ||3l[^[54]]. 
More generally, proximal stochastic gradient methods have been applied to various problems; see 
for instance Ol [43 SSj [55ll . 

The objective of the present paper is to provide an analysis of the stochastic forward-backward method in the context of Algorithm 1.3. Almost sure convergence of the iterates $(x_n)_{n\in\mathbb{N}}$ to a solution to Problem 1.1 will be established under general conditions on the sequences $(u_n)_{n\in\mathbb{N}}$, $(a_n)_{n\in\mathbb{N}}$, $(\gamma_n)_{n\in\mathbb{N}}$, and $(\lambda_n)_{n\in\mathbb{N}}$. In particular, a feature of our analysis is that it allows for relaxation parameters and it does not require that the proximal parameter sequence $(\gamma_n)_{n\in\mathbb{N}}$ be vanishing. Our proofs are based on properties of stochastic quasi-Fejér iterations [16], for which we provide a novel convergence result.

The organization of the paper is as follows. The notation is introduced in Section 2. Section 3 provides an asymptotic principle which will be used in Section 4 to present the main result on the weak and strong convergence of the iterates of Algorithm 1.3. Finally, Section 5 deals with applications and proposes a stochastic primal-dual method.


2 Notation 


$\mathrm{Id}$ denotes the identity operator on $H$, and $\rightharpoonup$ and $\to$ denote, respectively, weak and strong convergence. The sets of weak and strong sequential cluster points of a sequence $(x_n)_{n\in\mathbb{N}}$ in $H$ are denoted by $\mathfrak{W}(x_n)_{n\in\mathbb{N}}$ and $\mathfrak{S}(x_n)_{n\in\mathbb{N}}$, respectively.

Let $A\colon H\to 2^{H}$ be a set-valued operator. The domain of $A$ is $\operatorname{dom}A=\{x\in H\mid Ax\neq\varnothing\}$ and the graph of $A$ is $\operatorname{gra}A=\{(x,u)\in H\times H\mid u\in Ax\}$. The inverse $A^{-1}$ of $A$ is defined via the equivalences $(\forall(x,u)\in H^2)$ $x\in A^{-1}u\Leftrightarrow u\in Ax$. The resolvent of $A$ is $J_A=(\mathrm{Id}+A)^{-1}$. If $A$ is maximally monotone, then $J_A$ is single-valued and firmly nonexpansive, with $\operatorname{dom}J_A=H$. An operator $A\colon H\to 2^{H}$ is demiregular at $x\in\operatorname{dom}A$ if, for every sequence $(x_n,u_n)_{n\in\mathbb{N}}$ in $\operatorname{gra}A$ and every $u\in Ax$ such that $x_n\rightharpoonup x$ and $u_n\to u$, we have $x_n\to x$ [2]. Let $G$ be a real Hilbert space. We denote by $\mathscr{B}(H,G)$ the space of bounded linear operators from $H$ to $G$, and we set $\mathscr{B}(H)=\mathscr{B}(H,H)$. The adjoint of $L\in\mathscr{B}(H,G)$ is denoted by $L^*$. For more details on convex analysis and monotone operator theory, see [4].
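To make the firm nonexpansiveness of resolvents concrete, the sketch below uses a one-dimensional example chosen for illustration (it is not from the paper): $A$ is the normal cone operator of an interval $[lo,hi]$, whose resolvent is the projection onto that interval. The helper returns the slack in the defining inequality $\|J_Ax-J_Ay\|^2+\|(x-J_Ax)-(y-J_Ay)\|^2\le\|x-y\|^2$; both function names are hypothetical.

```python
def resolvent_normal_cone(x, lo, hi):
    """Resolvent of the normal cone operator of [lo, hi]: projection onto [lo, hi]."""
    return min(max(x, lo), hi)


def firm_nonexpansiveness_gap(J, x, y):
    """Return |x-y|^2 - |Jx-Jy|^2 - |(x-Jx)-(y-Jy)|^2.

    The value is nonnegative for every pair (x, y) exactly when J is
    firmly nonexpansive (scalar case)."""
    jx, jy = J(x), J(y)
    return (x - y) ** 2 - (jx - jy) ** 2 - ((x - jx) - (y - jy)) ** 2
```

Evaluating the gap on a grid of pairs, for instance with `J = lambda s: resolvent_normal_cone(s, -1.0, 1.0)`, should never produce a negative value beyond rounding error.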

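Let $(\Omega,\mathcal{F},\mathsf{P})$ denote the underlying probability space. The smallest $\sigma$-algebra generated by a family $\Phi$ of random variables is denoted by $\sigma(\Phi)$. Given a sequence $(x_n)_{n\in\mathbb{N}}$ of $H$-valued random variables, we denote by $\mathscr{X}=(\mathcal{X}_n)_{n\in\mathbb{N}}$ a sequence of sigma-algebras such that

\[ (\forall n\in\mathbb{N})\quad \mathcal{X}_n\subset\mathcal{F}\quad\text{and}\quad\sigma(x_0,\ldots,x_n)\subset\mathcal{X}_n\subset\mathcal{X}_{n+1}. \tag{2.1} \]

Furthermore, we denote by $\ell_+(\mathscr{X})$ the set of sequences of $\left[0,+\infty\right[$-valued random variables $(\xi_n)_{n\in\mathbb{N}}$ such that, for every $n\in\mathbb{N}$, $\xi_n$ is $\mathcal{X}_n$-measurable, and we define

\[ (\forall p\in\left]0,+\infty\right[)\quad \ell_+^p(\mathscr{X})=\Bigl\{(\xi_n)_{n\in\mathbb{N}}\in\ell_+(\mathscr{X})\;\Big|\;\sum_{n\in\mathbb{N}}\xi_n^p<+\infty\ \ \mathsf{P}\text{-a.s.}\Bigr\} \tag{2.2} \]

and

\[ \ell_+^\infty(\mathscr{X})=\Bigl\{(\xi_n)_{n\in\mathbb{N}}\in\ell_+(\mathscr{X})\;\Big|\;\sup_{n\in\mathbb{N}}\xi_n<+\infty\ \ \mathsf{P}\text{-a.s.}\Bigr\}. \tag{2.3} \]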


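Equalities and inequalities involving random variables will always be understood to hold $\mathsf{P}$-almost surely, although this will not always be expressly mentioned. Let $\mathcal{E}$ be a sub sigma-algebra of $\mathcal{F}$, let $x\in L^1(\Omega,\mathcal{F},\mathsf{P};H)$, and let $y\in L^1(\Omega,\mathcal{E},\mathsf{P};H)$. Then $y$ is the conditional expectation of $x$ with respect to $\mathcal{E}$ if $(\forall E\in\mathcal{E})$ $\int_E x\,d\mathsf{P}=\int_E y\,d\mathsf{P}$; in this case we write $y=\mathsf{E}(x\mid\mathcal{E})$. We have

\[ (\forall x\in L^1(\Omega,\mathcal{F},\mathsf{P};H))\quad \|\mathsf{E}(x\mid\mathcal{E})\|\le\mathsf{E}(\|x\|\mid\mathcal{E}). \tag{2.4} \]

In addition, $L^2(\Omega,\mathcal{F},\mathsf{P};H)$ is a Hilbert space and

\[ (\forall x\in L^2(\Omega,\mathcal{F},\mathsf{P};H))\quad \begin{cases} \|\mathsf{E}(x\mid\mathcal{E})\|^2\le\mathsf{E}(\|x\|^2\mid\mathcal{E})\\ (\forall u\in H)\quad \mathsf{E}(\langle x\mid u\rangle\mid\mathcal{E})=\langle\mathsf{E}(x\mid\mathcal{E})\mid u\rangle. \end{cases} \tag{2.5} \]

Geometrically, if $x\in L^2(\Omega,\mathcal{F},\mathsf{P};H)$, $\mathsf{E}(x\mid\mathcal{E})$ is the projection of $x$ onto $L^2(\Omega,\mathcal{E},\mathsf{P};H)$. For background on probability in Hilbert spaces, see [37].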


3 An asymptotic principle 


In this section, we establish an asymptotic principle which will lay the foundation for the conver¬ 
gence analysis of our stochastic forward-backward algorithm. First, we need the following result. 

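Proposition 3.1 Let $F$ be a nonempty closed subset of $H$, let $\phi\colon\left[0,+\infty\right[\to\left[0,+\infty\right[$ be a strictly increasing function such that $\lim_{t\to+\infty}\phi(t)=+\infty$, let $(x_n)_{n\in\mathbb{N}}$ be a sequence of $H$-valued random variables, and let $(\mathcal{X}_n)_{n\in\mathbb{N}}$ be a sequence of sub-sigma-algebras of $\mathcal{F}$ such that

\[ (\forall n\in\mathbb{N})\quad \sigma(x_0,\ldots,x_n)\subset\mathcal{X}_n\subset\mathcal{X}_{n+1}. \tag{3.1} \]

Suppose that, for every $z\in F$, there exist $(\vartheta_n(z))_{n\in\mathbb{N}}\in\ell_+(\mathscr{X})$, $(\chi_n(z))_{n\in\mathbb{N}}\in\ell_+^1(\mathscr{X})$, and $(\eta_n(z))_{n\in\mathbb{N}}\in\ell_+^1(\mathscr{X})$ such that

\[ (\forall n\in\mathbb{N})\quad \mathsf{E}(\phi(\|x_{n+1}-z\|)\mid\mathcal{X}_n)+\vartheta_n(z)\le(1+\chi_n(z))\phi(\|x_n-z\|)+\eta_n(z)\quad\mathsf{P}\text{-a.s.} \tag{3.2} \]

Then the following hold:

(i) $(\forall z\in F)$ $\bigl[\sum_{n\in\mathbb{N}}\vartheta_n(z)<+\infty$ $\mathsf{P}$-a.s.$\bigr]$

(ii) $(x_n)_{n\in\mathbb{N}}$ is bounded $\mathsf{P}$-a.s.

(iii) There exists $\widetilde{\Omega}\in\mathcal{F}$ such that $\mathsf{P}(\widetilde{\Omega})=1$ and, for every $\omega\in\widetilde{\Omega}$ and every $z\in F$, $(\|x_n(\omega)-z\|)_{n\in\mathbb{N}}$ converges.

(iv) Suppose that $\mathfrak{W}(x_n)_{n\in\mathbb{N}}\subset F$ $\mathsf{P}$-a.s. Then $(x_n)_{n\in\mathbb{N}}$ converges weakly $\mathsf{P}$-a.s. to an $F$-valued random variable.

(v) Suppose that $\mathfrak{S}(x_n)_{n\in\mathbb{N}}\cap F\neq\varnothing$ $\mathsf{P}$-a.s. Then $(x_n)_{n\in\mathbb{N}}$ converges strongly $\mathsf{P}$-a.s. to an $F$-valued random variable.

(vi) Suppose that $\mathfrak{S}(x_n)_{n\in\mathbb{N}}\neq\varnothing$ $\mathsf{P}$-a.s. and that $\mathfrak{W}(x_n)_{n\in\mathbb{N}}\subset F$ $\mathsf{P}$-a.s. Then $(x_n)_{n\in\mathbb{N}}$ converges strongly $\mathsf{P}$-a.s. to an $F$-valued random variable.

Proof. This is [16, Proposition 2.3] in the case when $(\forall n\in\mathbb{N})$ $\mathcal{X}_n=\sigma(x_0,\ldots,x_n)$. However, the proof remains the same in the more general setting of (2.1). □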

The following result describes the asymptotic behavior of an abstract stochastic recursion in 
Hilbert spaces. 

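Theorem 3.2 Let $F$ be a nonempty closed subset of $H$ and let $(\lambda_n)_{n\in\mathbb{N}}$ be a sequence in $\left]0,1\right]$. In addition, let $(x_n)_{n\in\mathbb{N}}$, $(t_n)_{n\in\mathbb{N}}$, $(c_n)_{n\in\mathbb{N}}$, and $(d_n)_{n\in\mathbb{N}}$ be sequences in $L^2(\Omega,\mathcal{F},\mathsf{P};H)$. Suppose that the following are satisfied:

(a) $\mathscr{X}=(\mathcal{X}_n)_{n\in\mathbb{N}}$ is a sequence of sub-sigma-algebras of $\mathcal{F}$ such that $(\forall n\in\mathbb{N})$ $\sigma(x_0,\ldots,x_n)\subset\mathcal{X}_n\subset\mathcal{X}_{n+1}$.

(b) $(\forall n\in\mathbb{N})$ $x_{n+1}=x_n+\lambda_n(t_n+c_n-x_n)$.

(c) $\sum_{n\in\mathbb{N}}\lambda_n\sqrt{\mathsf{E}(\|c_n\|^2\mid\mathcal{X}_n)}<+\infty$ and $\sum_{n\in\mathbb{N}}\sqrt{\lambda_n\mathsf{E}(\|d_n\|^2\mid\mathcal{X}_n)}<+\infty$.

(d) For every $z\in F$, there exist a sequence $(s_n(z))_{n\in\mathbb{N}}$ of $H$-valued random variables, $(\theta_{1,n}(z))_{n\in\mathbb{N}}\in\ell_+(\mathscr{X})$, $(\theta_{2,n}(z))_{n\in\mathbb{N}}\in\ell_+(\mathscr{X})$, $(\mu_{1,n}(z))_{n\in\mathbb{N}}\in\ell_+^\infty(\mathscr{X})$, $(\mu_{2,n}(z))_{n\in\mathbb{N}}\in\ell_+^\infty(\mathscr{X})$, $(\nu_{1,n}(z))_{n\in\mathbb{N}}\in\ell_+^\infty(\mathscr{X})$, and $(\nu_{2,n}(z))_{n\in\mathbb{N}}\in\ell_+^\infty(\mathscr{X})$ such that $(\lambda_n\mu_{1,n}(z))_{n\in\mathbb{N}}\in\ell_+^1(\mathscr{X})$, $(\lambda_n\mu_{2,n}(z))_{n\in\mathbb{N}}\in\ell_+^1(\mathscr{X})$, $(\lambda_n\nu_{1,n}(z))_{n\in\mathbb{N}}\in\ell_+^{1/2}(\mathscr{X})$, $(\lambda_n\nu_{2,n}(z))_{n\in\mathbb{N}}\in\ell_+^{1/2}(\mathscr{X})$,

\[ (\forall n\in\mathbb{N})\quad \mathsf{E}(\|t_n-z\|^2\mid\mathcal{X}_n)+\theta_{1,n}(z)\le(1+\mu_{1,n}(z))\,\mathsf{E}(\|s_n(z)+d_n\|^2\mid\mathcal{X}_n)+\nu_{1,n}(z), \tag{3.3} \]

and

\[ (\forall n\in\mathbb{N})\quad \mathsf{E}(\|s_n(z)\|^2\mid\mathcal{X}_n)+\theta_{2,n}(z)\le(1+\mu_{2,n}(z))\|x_n-z\|^2+\nu_{2,n}(z). \tag{3.4} \]

Then the following hold:

(i) $(\forall z\in F)$ $\bigl[\sum_{n\in\mathbb{N}}\lambda_n\theta_{1,n}(z)<+\infty$ and $\sum_{n\in\mathbb{N}}\lambda_n\theta_{2,n}(z)<+\infty$ $\mathsf{P}$-a.s.$\bigr]$.

(ii) $\sum_{n\in\mathbb{N}}\lambda_n(1-\lambda_n)\mathsf{E}(\|t_n-x_n\|^2\mid\mathcal{X}_n)<+\infty$ $\mathsf{P}$-a.s.

(iii) Suppose that $\mathfrak{W}(x_n)_{n\in\mathbb{N}}\subset F$ $\mathsf{P}$-a.s. Then $(x_n)_{n\in\mathbb{N}}$ converges weakly $\mathsf{P}$-a.s. to an $F$-valued random variable.

(iv) Suppose that $\mathfrak{S}(x_n)_{n\in\mathbb{N}}\cap F\neq\varnothing$ $\mathsf{P}$-a.s. Then $(x_n)_{n\in\mathbb{N}}$ converges strongly $\mathsf{P}$-a.s. to an $F$-valued random variable.

(v) Suppose that $\mathfrak{S}(x_n)_{n\in\mathbb{N}}\neq\varnothing$ $\mathsf{P}$-a.s. and that $\mathfrak{W}(x_n)_{n\in\mathbb{N}}\subset F$ $\mathsf{P}$-a.s. Then $(x_n)_{n\in\mathbb{N}}$ converges strongly $\mathsf{P}$-a.s. to an $F$-valued random variable.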

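Proof. Let $z\in F$. By (2.5) and (3.3),

\[ \begin{aligned} (\forall n\in\mathbb{N})\quad \mathsf{E}(\|t_n-z\|\mid\mathcal{X}_n) &\le\sqrt{\mathsf{E}(\|t_n-z\|^2\mid\mathcal{X}_n)}\\ &\le\sqrt{1+\mu_{1,n}(z)}\,\sqrt{\mathsf{E}(\|s_n(z)+d_n\|^2\mid\mathcal{X}_n)}+\sqrt{\nu_{1,n}(z)}\\ &\le\Bigl(1+\frac{\mu_{1,n}(z)}{2}\Bigr)\sqrt{\mathsf{E}(\|s_n(z)+d_n\|^2\mid\mathcal{X}_n)}+\sqrt{\nu_{1,n}(z)}. \end{aligned} \tag{3.5} \]

On the other hand, according to the triangle inequality and (3.4),

\[ \begin{aligned} (\forall n\in\mathbb{N})\quad \sqrt{\mathsf{E}(\|s_n(z)+d_n\|^2\mid\mathcal{X}_n)} &\le\sqrt{\mathsf{E}(\|s_n(z)\|^2\mid\mathcal{X}_n)}+\sqrt{\mathsf{E}(\|d_n\|^2\mid\mathcal{X}_n)}\\ &\le\Bigl(1+\frac{\mu_{2,n}(z)}{2}\Bigr)\|x_n-z\|+\sqrt{\nu_{2,n}(z)}+\sqrt{\mathsf{E}(\|d_n\|^2\mid\mathcal{X}_n)}. \end{aligned} \tag{3.6} \]

Furthermore, (b) yields

\[ (\forall n\in\mathbb{N})\quad \|x_{n+1}-z\|\le(1-\lambda_n)\|x_n-z\|+\lambda_n\|t_n-z\|+\lambda_n\|c_n\|. \tag{3.7} \]

Consequently, (3.5) and (3.6) lead to

\[ \begin{aligned} (\forall n\in\mathbb{N})\quad \mathsf{E}(\|x_{n+1}-z\|\mid\mathcal{X}_n) &\le(1-\lambda_n)\|x_n-z\|+\lambda_n\mathsf{E}(\|t_n-z\|\mid\mathcal{X}_n)+\lambda_n\mathsf{E}(\|c_n\|\mid\mathcal{X}_n)\\ &\le(1+\rho_n(z))\|x_n-z\|+\zeta_n(z), \end{aligned} \tag{3.8} \]

where

\[ \rho_n(z)=\frac{\lambda_n}{2}\Bigl(\mu_{1,n}(z)+\mu_{2,n}(z)+\frac{\mu_{1,n}(z)\mu_{2,n}(z)}{2}\Bigr) \tag{3.9} \]

and

\[ \zeta_n(z)=\lambda_n\sqrt{\nu_{1,n}(z)}+\lambda_n\Bigl(1+\frac{\mu_{1,n}(z)}{2}\Bigr)\Bigl(\sqrt{\nu_{2,n}(z)}+\sqrt{\mathsf{E}(\|d_n\|^2\mid\mathcal{X}_n)}\Bigr)+\lambda_n\mathsf{E}(\|c_n\|\mid\mathcal{X}_n). \tag{3.10} \]

Now set

\[ \overline{\mu}_1(z)=\sup_{n\in\mathbb{N}}\mu_{1,n}(z). \tag{3.11} \]

In view of (3.3) and (3.4), we have

\[ \begin{aligned} 2\sum_{n\in\mathbb{N}}\rho_n(z) &=\sum_{n\in\mathbb{N}}\lambda_n\mu_{1,n}(z)+\sum_{n\in\mathbb{N}}\lambda_n\mu_{2,n}(z)+\frac12\sum_{n\in\mathbb{N}}\lambda_n\mu_{1,n}(z)\mu_{2,n}(z)\\ &\le\sum_{n\in\mathbb{N}}\lambda_n\mu_{1,n}(z)+\Bigl(1+\frac{\overline{\mu}_1(z)}{2}\Bigr)\sum_{n\in\mathbb{N}}\lambda_n\mu_{2,n}(z)\\ &<+\infty. \end{aligned} \tag{3.12} \]

In addition, since (2.5) yields

\[ (\forall n\in\mathbb{N})\quad \mathsf{E}(\|c_n\|\mid\mathcal{X}_n)\le\sqrt{\mathsf{E}(\|c_n\|^2\mid\mathcal{X}_n)}, \tag{3.13} \]

we derive from (c) and (d) that

\[ \begin{aligned} \sum_{n\in\mathbb{N}}\zeta_n(z) &\le\sum_{n\in\mathbb{N}}\sqrt{\lambda_n\nu_{1,n}(z)}+\Bigl(1+\frac{\overline{\mu}_1(z)}{2}\Bigr)\Bigl(\sum_{n\in\mathbb{N}}\sqrt{\lambda_n\nu_{2,n}(z)}+\sum_{n\in\mathbb{N}}\sqrt{\lambda_n\mathsf{E}(\|d_n\|^2\mid\mathcal{X}_n)}\Bigr)\\ &\quad+\sum_{n\in\mathbb{N}}\lambda_n\sqrt{\mathsf{E}(\|c_n\|^2\mid\mathcal{X}_n)}\\ &<+\infty. \end{aligned} \tag{3.14} \]

Using Proposition 3.1(ii), (3.8), (3.12), and (3.14), we obtain that

\[ (\|x_n-z\|)_{n\in\mathbb{N}}\ \text{is almost surely bounded.} \tag{3.15} \]

In turn, by (3.4),

\[ \bigl(\mathsf{E}(\|s_n(z)\|^2\mid\mathcal{X}_n)\bigr)_{n\in\mathbb{N}}\ \text{is almost surely bounded.} \tag{3.16} \]

In addition, (3.3) implies that

\[ (\forall n\in\mathbb{N})\quad \mathsf{E}(\|t_n-z\|^2\mid\mathcal{X}_n)\le 2(1+\overline{\mu}_1(z))\bigl(\mathsf{E}(\|s_n(z)\|^2\mid\mathcal{X}_n)+\mathsf{E}(\|d_n\|^2\mid\mathcal{X}_n)\bigr)+\nu_{1,n}(z), \tag{3.17} \]

from which we deduce that

\[ \bigl(\lambda_n\mathsf{E}(\|t_n-z\|^2\mid\mathcal{X}_n)\bigr)_{n\in\mathbb{N}}\ \text{is almost surely bounded.} \tag{3.18} \]

Next, we observe that (3.3) and (3.4) yield

\[ \begin{aligned} (\forall n\in\mathbb{N})\quad &\mathsf{E}(\|t_n-z\|^2\mid\mathcal{X}_n)+\theta_{1,n}(z)+(1+\mu_{1,n}(z))\theta_{2,n}(z)\\ &\le(1+\mu_{1,n}(z))(1+\mu_{2,n}(z))\|x_n-z\|^2+\nu_{1,n}(z)\\ &\quad+(1+\mu_{1,n}(z))\bigl(\nu_{2,n}(z)+2\mathsf{E}(\langle s_n(z)\mid d_n\rangle\mid\mathcal{X}_n)+\mathsf{E}(\|d_n\|^2\mid\mathcal{X}_n)\bigr). \end{aligned} \tag{3.19} \]

Now set

\[ (\forall n\in\mathbb{N})\quad \begin{cases} \theta_n(z)=\theta_{1,n}(z)+(1+\mu_{1,n}(z))\theta_{2,n}(z)\\ \mu_n(z)=\mu_{1,n}(z)+(1+\overline{\mu}_1(z))\mu_{2,n}(z)\\ \nu_n(z)=\nu_{1,n}(z)+(1+\overline{\mu}_1(z))\bigl(\nu_{2,n}(z)+2\sqrt{\mathsf{E}(\|s_n(z)\|^2\mid\mathcal{X}_n)}\sqrt{\mathsf{E}(\|d_n\|^2\mid\mathcal{X}_n)}+\mathsf{E}(\|d_n\|^2\mid\mathcal{X}_n)\bigr)\\ \xi_n(z)=2\lambda_n\|t_n-z\|\,\|c_n\|+2(1-\lambda_n)\|x_n-z\|\,\|c_n\|+\lambda_n\|c_n\|^2. \end{cases} \tag{3.20} \]

By the Cauchy-Schwarz inequality and (3.19),

\[ (\forall n\in\mathbb{N})\quad \mathsf{E}(\|t_n-z\|^2\mid\mathcal{X}_n)+\theta_n(z)\le(1+\mu_n(z))\|x_n-z\|^2+\nu_n(z). \tag{3.21} \]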

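On the other hand, by the conditional Cauchy-Schwarz inequality,

\[ \begin{aligned} (\forall n\in\mathbb{N})\quad \lambda_n\mathsf{E}(\xi_n(z)\mid\mathcal{X}_n) &\le 2(1-\lambda_n)\lambda_n\|x_n-z\|\,\mathsf{E}(\|c_n\|\mid\mathcal{X}_n)\\ &\quad+2\lambda_n\sqrt{\lambda_n\mathsf{E}(\|t_n-z\|^2\mid\mathcal{X}_n)}\sqrt{\lambda_n\mathsf{E}(\|c_n\|^2\mid\mathcal{X}_n)}+\lambda_n^2\mathsf{E}(\|c_n\|^2\mid\mathcal{X}_n)\\ &\le 2\|x_n-z\|\,\lambda_n\sqrt{\mathsf{E}(\|c_n\|^2\mid\mathcal{X}_n)}\\ &\quad+2\sqrt{\lambda_n\mathsf{E}(\|t_n-z\|^2\mid\mathcal{X}_n)}\,\lambda_n\sqrt{\mathsf{E}(\|c_n\|^2\mid\mathcal{X}_n)}+\lambda_n^2\mathsf{E}(\|c_n\|^2\mid\mathcal{X}_n). \end{aligned} \tag{3.22} \]

Thus, it follows from (3.15), (c), and (3.18) that

\[ \sum_{n\in\mathbb{N}}\lambda_n\mathsf{E}(\xi_n(z)\mid\mathcal{X}_n)<+\infty. \tag{3.23} \]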

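Let us define

\[ (\forall n\in\mathbb{N})\quad \begin{cases} \vartheta_n(z)=\lambda_n\theta_n(z)+\lambda_n(1-\lambda_n)\mathsf{E}(\|t_n-x_n\|^2\mid\mathcal{X}_n)\\ \chi_n(z)=\lambda_n\mu_n(z)\\ \eta_n(z)=\lambda_n\mathsf{E}(\xi_n(z)\mid\mathcal{X}_n)+\lambda_n\nu_n(z). \end{cases} \tag{3.24} \]

It follows from (c), (d), (3.16), and the inclusion $\ell_+^{1/2}(\mathscr{X})\subset\ell_+^1(\mathscr{X})$ that $(\theta_n(z))_{n\in\mathbb{N}}\in\ell_+(\mathscr{X})$, $(\lambda_n\mu_n(z))_{n\in\mathbb{N}}\in\ell_+^1(\mathscr{X})$, and $(\lambda_n\nu_n(z))_{n\in\mathbb{N}}\in\ell_+^1(\mathscr{X})$. Therefore,

\[ (\vartheta_n(z))_{n\in\mathbb{N}}\in\ell_+(\mathscr{X}) \tag{3.25} \]

and

\[ (\chi_n(z))_{n\in\mathbb{N}}\in\ell_+^1(\mathscr{X}). \tag{3.26} \]

Furthermore, we deduce from (3.23) that

\[ (\eta_n(z))_{n\in\mathbb{N}}\in\ell_+^1(\mathscr{X}). \tag{3.27} \]

Next, we derive from (b), [4, Corollary 2.14], and (3.21) that

\[ \begin{aligned} (\forall n\in\mathbb{N})\quad &\mathsf{E}(\|x_{n+1}-z\|^2\mid\mathcal{X}_n)\\ &=\mathsf{E}(\|(1-\lambda_n)(x_n-z)+\lambda_n(t_n-z+c_n)\|^2\mid\mathcal{X}_n)\\ &=(1-\lambda_n)\mathsf{E}(\|x_n-z\|^2\mid\mathcal{X}_n)+\lambda_n\mathsf{E}(\|t_n-z+c_n\|^2\mid\mathcal{X}_n)\\ &\quad-\lambda_n(1-\lambda_n)\mathsf{E}(\|t_n-x_n+c_n\|^2\mid\mathcal{X}_n)\\ &=(1-\lambda_n)\|x_n-z\|^2+\lambda_n\mathsf{E}(\|t_n-z\|^2\mid\mathcal{X}_n)\\ &\quad+2\lambda_n\mathsf{E}(\langle t_n-z\mid c_n\rangle\mid\mathcal{X}_n)-\lambda_n(1-\lambda_n)\mathsf{E}(\|t_n-x_n\|^2\mid\mathcal{X}_n)\\ &\quad-2\lambda_n(1-\lambda_n)\mathsf{E}(\langle t_n-x_n\mid c_n\rangle\mid\mathcal{X}_n)+\lambda_n^2\mathsf{E}(\|c_n\|^2\mid\mathcal{X}_n)\\ &=(1-\lambda_n)\|x_n-z\|^2+\lambda_n\mathsf{E}(\|t_n-z\|^2\mid\mathcal{X}_n)\\ &\quad-\lambda_n(1-\lambda_n)\mathsf{E}(\|t_n-x_n\|^2\mid\mathcal{X}_n)+2\lambda_n^2\mathsf{E}(\langle t_n-z\mid c_n\rangle\mid\mathcal{X}_n)\\ &\quad+2\lambda_n(1-\lambda_n)\mathsf{E}(\langle x_n-z\mid c_n\rangle\mid\mathcal{X}_n)+\lambda_n^2\mathsf{E}(\|c_n\|^2\mid\mathcal{X}_n)\\ &\le(1-\lambda_n)\|x_n-z\|^2+\lambda_n\mathsf{E}(\|t_n-z\|^2\mid\mathcal{X}_n)\\ &\quad-\lambda_n(1-\lambda_n)\mathsf{E}(\|t_n-x_n\|^2\mid\mathcal{X}_n)+\lambda_n\mathsf{E}(\xi_n(z)\mid\mathcal{X}_n)\\ &\le(1+\chi_n(z))\|x_n-z\|^2-\vartheta_n(z)+\eta_n(z). \end{aligned} \tag{3.28} \]

We therefore recover (3.2) with $\phi\colon t\mapsto t^2$. Hence, appealing to (3.25), (3.26), (3.27), and Proposition 3.1(i), we obtain $(\vartheta_n(z))_{n\in\mathbb{N}}\in\ell_+^1(\mathscr{X})$, which establishes (i) and (ii). Finally, (iii)–(v) follow from Proposition 3.1(iv)–(vi). □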


Remark 3.3 


(i) Theorem 13.21 extends IfT^ Theorem 2.5], which corresponds to the special case when, for 
every n E N and every z E F, /ii,n(z) = z^i,n(z) = (* 2 ,n(z) = 0 and dn = 0. Note that the 

assumptions in Theorem 13.21 are just made to unify the presentation with the forthcoming 
results of Section |4l However, since we take only conditional expectations of [0, +cx3 [-valued 
random variables, they are not necessary. 

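(ii) Suppose that $(\forall n\in\mathbb{N})$ $c_n=d_n=0$. Then (3.20) and (3.24) imply that

\[ (\forall n\in\mathbb{N})\quad \eta_n(z)=\lambda_n\bigl(\nu_{1,n}(z)+(1+\overline{\mu}_1(z))\nu_{2,n}(z)\bigr), \tag{3.29} \]

and it follows directly from (3.28) and Proposition 3.1 that the conditions on $(\nu_{1,n}(z))_{n\in\mathbb{N}}$ and $(\nu_{2,n}(z))_{n\in\mathbb{N}}$ can be weakened to $(\lambda_n\nu_{1,n}(z))_{n\in\mathbb{N}}\in\ell_+^1(\mathscr{X})$ and $(\lambda_n\nu_{2,n}(z))_{n\in\mathbb{N}}\in\ell_+^1(\mathscr{X})$.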


4 A stochastic forward-backward algorithm 

We now state the main result of the paper.

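Theorem 4.1 Consider the setting of Problem 1.1, let $(\tau_n)_{n\in\mathbb{N}}$ be a sequence in $\left[0,+\infty\right[$, let $\mathscr{X}=(\mathcal{X}_n)_{n\in\mathbb{N}}$ be a sequence of sub-sigma-algebras of $\mathcal{F}$, and let $(x_n)_{n\in\mathbb{N}}$ be a sequence generated by Algorithm 1.3. Assume that the following are satisfied:

(a) $(\forall n\in\mathbb{N})$ $\sigma(x_0,\ldots,x_n)\subset\mathcal{X}_n\subset\mathcal{X}_{n+1}$.

(b) $\sum_{n\in\mathbb{N}}\lambda_n\sqrt{\mathsf{E}(\|a_n\|^2\mid\mathcal{X}_n)}<+\infty$.

(c) $\sum_{n\in\mathbb{N}}\sqrt{\lambda_n}\,\|\mathsf{E}(u_n\mid\mathcal{X}_n)-Bx_n\|<+\infty$.

(d) For every $z\in F$, there exists $(\zeta_n(z))_{n\in\mathbb{N}}\in\ell_+^\infty(\mathscr{X})$ such that $(\lambda_n\zeta_n(z))_{n\in\mathbb{N}}\in\ell_+^{1/2}(\mathscr{X})$ and

\[ (\forall n\in\mathbb{N})\quad \mathsf{E}(\|u_n-\mathsf{E}(u_n\mid\mathcal{X}_n)\|^2\mid\mathcal{X}_n)\le\tau_n\|Bx_n-Bz\|^2+\zeta_n(z). \tag{4.1} \]

(e) $\inf_{n\in\mathbb{N}}\gamma_n>0$, $\sup_{n\in\mathbb{N}}\tau_n<+\infty$, and $\sup_{n\in\mathbb{N}}(1+\tau_n)\gamma_n<2\vartheta$.

(f) Either $\inf_{n\in\mathbb{N}}\lambda_n>0$ or $\bigl[\gamma_n\equiv\gamma$, $\sum_{n\in\mathbb{N}}\tau_n<+\infty$, and $\sum_{n\in\mathbb{N}}\lambda_n=+\infty\bigr]$.

Then the following hold for some $F$-valued random variable $x$:

(i) $(\forall z\in F)$ $\sum_{n\in\mathbb{N}}\lambda_n\|Bx_n-Bz\|^2<+\infty$ $\mathsf{P}$-a.s.

(ii) $(\forall z\in F)$ $\sum_{n\in\mathbb{N}}\lambda_n\|x_n-\gamma_nBx_n-J_{\gamma_nA}(x_n-\gamma_nBx_n)+\gamma_nBz\|^2<+\infty$ $\mathsf{P}$-a.s.

(iii) $(x_n)_{n\in\mathbb{N}}$ converges weakly $\mathsf{P}$-a.s. to $x$.

(iv) Suppose that one of the following is satisfied:

(g) $A$ is demiregular at every $z\in F$.

(h) $B$ is demiregular at every $z\in F$.

Then $(x_n)_{n\in\mathbb{N}}$ converges strongly $\mathsf{P}$-a.s. to $x$.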

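Proof. Set

\[ (\forall n\in\mathbb{N})\quad R_n=\mathrm{Id}-\gamma_nB,\quad r_n=x_n-\gamma_nu_n,\quad\text{and}\quad t_n=J_{\gamma_nA}r_n. \tag{4.2} \]

Then it follows from (1.5) that assumption (b) in Theorem 3.2 is satisfied with

\[ (\forall n\in\mathbb{N})\quad c_n=a_n. \tag{4.3} \]

In addition, for every $n\in\mathbb{N}$, $F=\operatorname{Fix}(J_{\gamma_nA}R_n)$ [4, Proposition 25.1(iv)] and we deduce from the firm nonexpansiveness of the operators $(J_{\gamma_nA})_{n\in\mathbb{N}}$ [4, Corollary 23.8] that

\[ (\forall z\in F)(\forall n\in\mathbb{N})\quad \|t_n-z\|^2+\|r_n-J_{\gamma_nA}r_n-R_nz+z\|^2\le\|r_n-R_nz\|^2. \tag{4.4} \]

Now set

\[ (\forall n\in\mathbb{N})\quad \widetilde{u}_n=u_n-\mathsf{E}(u_n\mid\mathcal{X}_n)+Bx_n. \tag{4.5} \]

Then we derive from (4.4) that (3.3) holds with

\[ (\forall z\in F)(\forall n\in\mathbb{N})\quad \begin{cases} \theta_{1,n}(z)=\mathsf{E}(\|r_n-J_{\gamma_nA}r_n-R_nz+z\|^2\mid\mathcal{X}_n)\\ \mu_{1,n}(z)=\nu_{1,n}(z)=0\\ s_n(z)=x_n-\gamma_n\widetilde{u}_n-R_nz\\ d_n=-\gamma_n(\mathsf{E}(u_n\mid\mathcal{X}_n)-Bx_n). \end{cases} \tag{4.6} \]

Thus, (4.3), (4.6), (b), (c), and (e) imply that assumption (c) in Theorem 3.2 is satisfied since

\[ \sum_{n\in\mathbb{N}}\sqrt{\lambda_n\mathsf{E}(\|d_n\|^2\mid\mathcal{X}_n)}=\sum_{n\in\mathbb{N}}\gamma_n\sqrt{\lambda_n}\,\|\mathsf{E}(u_n\mid\mathcal{X}_n)-Bx_n\|\le 2\vartheta\sum_{n\in\mathbb{N}}\sqrt{\lambda_n}\,\|\mathsf{E}(u_n\mid\mathcal{X}_n)-Bx_n\|<+\infty. \tag{4.7} \]

Moreover, for every $z\in F$ and $n\in\mathbb{N}$, we derive from (4.5), (1.1), and (4.1) that

\[ \begin{aligned} \mathsf{E}(\|s_n(z)\|^2\mid\mathcal{X}_n) &=\mathsf{E}(\|x_n-z-\gamma_n(\widetilde{u}_n-Bz)\|^2\mid\mathcal{X}_n)\\ &=\|x_n-z\|^2-2\gamma_n\langle x_n-z\mid\mathsf{E}(\widetilde{u}_n\mid\mathcal{X}_n)-Bz\rangle+\gamma_n^2\mathsf{E}(\|\widetilde{u}_n-Bz\|^2\mid\mathcal{X}_n)\\ &=\|x_n-z\|^2-2\gamma_n\langle x_n-z\mid Bx_n-Bz\rangle+\gamma_n^2\bigl(\mathsf{E}(\|u_n-\mathsf{E}(u_n\mid\mathcal{X}_n)\|^2\mid\mathcal{X}_n)\\ &\quad+2\mathsf{E}(\langle u_n-\mathsf{E}(u_n\mid\mathcal{X}_n)\mid Bx_n-Bz\rangle\mid\mathcal{X}_n)+\|Bx_n-Bz\|^2\bigr)\\ &=\|x_n-z\|^2-2\gamma_n\langle x_n-z\mid Bx_n-Bz\rangle\\ &\quad+\gamma_n^2\bigl(\mathsf{E}(\|u_n-\mathsf{E}(u_n\mid\mathcal{X}_n)\|^2\mid\mathcal{X}_n)+\|Bx_n-Bz\|^2\bigr)\\ &\le\|x_n-z\|^2-\gamma_n(2\vartheta-\gamma_n)\|Bx_n-Bz\|^2+\gamma_n^2\mathsf{E}(\|u_n-\mathsf{E}(u_n\mid\mathcal{X}_n)\|^2\mid\mathcal{X}_n)\\ &\le\|x_n-z\|^2-\gamma_n(2\vartheta-(1+\tau_n)\gamma_n)\|Bx_n-Bz\|^2+\gamma_n^2\zeta_n(z). \end{aligned} \tag{4.8} \]

Thus, (3.4) is obtained by setting

\[ (\forall n\in\mathbb{N})\quad \begin{cases} \theta_{2,n}(z)=\gamma_n(2\vartheta-(1+\tau_n)\gamma_n)\|Bx_n-Bz\|^2\\ \mu_{2,n}(z)=0\\ \nu_{2,n}(z)=\gamma_n^2\zeta_n(z). \end{cases} \tag{4.9} \]

Altogether, it follows from (d) and (e) that assumption (d) in Theorem 3.2 is also satisfied. By applying Theorem 3.2(i), we deduce from (e), (4.6), and (4.9) that

\[ (\forall z\in F)\quad \sum_{n\in\mathbb{N}}\lambda_n\|Bx_n-Bz\|^2<+\infty \tag{4.10} \]

and

\[ (\forall z\in F)\quad \sum_{n\in\mathbb{N}}\lambda_n\mathsf{E}(\|r_n-J_{\gamma_nA}r_n-R_nz+z\|^2\mid\mathcal{X}_n)<+\infty. \tag{4.11} \]

(i): See (4.10).

(ii): Let $z\in F$. It follows from (4.2), (4.5), (2.5), and the nonexpansiveness of the operators $(J_{\gamma_nA})_{n\in\mathbb{N}}$ that

\[ \begin{aligned} (\forall n\in\mathbb{N})\quad &\|x_n-\gamma_nBx_n-J_{\gamma_nA}(x_n-\gamma_nBx_n)+\gamma_nBz\|^2\\ &=\|\mathsf{E}(r_n-J_{\gamma_nA}r_n+\gamma_nBz\mid\mathcal{X}_n)+\gamma_n(\mathsf{E}(u_n\mid\mathcal{X}_n)-Bx_n)\\ &\qquad+\mathsf{E}(J_{\gamma_nA}r_n\mid\mathcal{X}_n)-J_{\gamma_nA}(x_n-\gamma_nBx_n)\|^2\\ &\le 3\bigl(\|\mathsf{E}(r_n-J_{\gamma_nA}r_n+\gamma_nBz\mid\mathcal{X}_n)\|^2+\gamma_n^2\|\mathsf{E}(u_n\mid\mathcal{X}_n)-Bx_n\|^2\\ &\qquad+\|\mathsf{E}(J_{\gamma_nA}r_n\mid\mathcal{X}_n)-J_{\gamma_nA}(x_n-\gamma_nBx_n)\|^2\bigr)\\ &\le 3\bigl(\mathsf{E}(\|r_n-J_{\gamma_nA}r_n+\gamma_nBz\|^2\mid\mathcal{X}_n)+\gamma_n^2\mathsf{E}(\|u_n-Bx_n\|^2\mid\mathcal{X}_n)\\ &\qquad+\mathsf{E}(\|J_{\gamma_nA}r_n-J_{\gamma_nA}(x_n-\gamma_nBx_n)\|^2\mid\mathcal{X}_n)\bigr)\\ &\le 3\bigl(\mathsf{E}(\|r_n-J_{\gamma_nA}r_n+\gamma_nBz\|^2\mid\mathcal{X}_n)+\gamma_n^2\mathsf{E}(\|u_n-Bx_n\|^2\mid\mathcal{X}_n)\\ &\qquad+\mathsf{E}(\|r_n-(x_n-\gamma_nBx_n)\|^2\mid\mathcal{X}_n)\bigr)\\ &=3\bigl(\mathsf{E}(\|r_n-J_{\gamma_nA}r_n-R_nz+z\|^2\mid\mathcal{X}_n)+2\gamma_n^2\mathsf{E}(\|u_n-Bx_n\|^2\mid\mathcal{X}_n)\bigr)\\ &\le 3\bigl(\mathsf{E}(\|r_n-J_{\gamma_nA}r_n-R_nz+z\|^2\mid\mathcal{X}_n)+8\vartheta^2\mathsf{E}(\|u_n-Bx_n\|^2\mid\mathcal{X}_n)\bigr). \end{aligned} \tag{4.12} \]

However, by (4.1),

\[ \begin{aligned} (\forall n\in\mathbb{N})\quad \mathsf{E}(\|u_n-Bx_n\|^2\mid\mathcal{X}_n) &\le 2\mathsf{E}(\|u_n-\mathsf{E}(u_n\mid\mathcal{X}_n)\|^2+\|\mathsf{E}(u_n\mid\mathcal{X}_n)-Bx_n\|^2\mid\mathcal{X}_n)\\ &\le 2\bigl(\tau_n\|Bx_n-Bz\|^2+\zeta_n(z)+\|\mathsf{E}(u_n\mid\mathcal{X}_n)-Bx_n\|^2\bigr). \end{aligned} \tag{4.13} \]

Since $\sup_{n\in\mathbb{N}}\tau_n<+\infty$ by (e), we therefore derive from (i), (c), and (d) that

\[ \sum_{n\in\mathbb{N}}\lambda_n\mathsf{E}(\|u_n-Bx_n\|^2\mid\mathcal{X}_n)<+\infty. \tag{4.14} \]

Altogether, the claim follows from (4.11), (4.12), and (4.14).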

(iii)–(iv): Let $z\in F$. We consider the two cases separately.

• Suppose that $\inf_{n\in\mathbb{N}}\lambda_n>0$. We derive from (i), (ii), and (e) that there exists $\widetilde{\Omega}\in\mathcal{F}$ such that $\mathsf{P}(\widetilde{\Omega})=1$,

\[ (\forall\omega\in\widetilde{\Omega})\quad x_n(\omega)-J_{\gamma_nA}\bigl(x_n(\omega)-\gamma_nBx_n(\omega)\bigr)\to 0, \tag{4.15} \]

and

\[ (\forall\omega\in\widetilde{\Omega})\quad Bx_n(\omega)\to Bz. \tag{4.16} \]

Now set

\[ (\forall n\in\mathbb{N})\quad y_n=J_{\gamma_nA}(x_n-\gamma_nBx_n)\quad\text{and}\quad v_n=\gamma_n^{-1}(x_n-y_n)-Bx_n. \tag{4.17} \]

It follows from (e), (4.15), and (4.16) that

\[ (\forall\omega\in\widetilde{\Omega})\quad y_n(\omega)-x_n(\omega)\to 0\quad\text{and}\quad v_n(\omega)\to-Bz. \tag{4.18} \]

Let $\omega\in\widetilde{\Omega}$. Assume that there exist $x\in H$ and a strictly increasing sequence $(k_n)_{n\in\mathbb{N}}$ in $\mathbb{N}$ such that $x_{k_n}(\omega)\rightharpoonup x$. Since $Bx_{k_n}(\omega)\to Bz$ by (4.16) and since $B$ is maximally monotone [4, Example 20.28], [4, Proposition 20.33(ii)] yields $Bx=Bz$. In addition, (4.18) implies that $y_{k_n}(\omega)\rightharpoonup x$ and $v_{k_n}(\omega)\to-Bz=-Bx$. Since (4.17) entails that $(y_{k_n}(\omega),v_{k_n}(\omega))_{n\in\mathbb{N}}$ lies in the graph of $A$, [4, Proposition 20.33(ii)] asserts that $-Bx\in Ax$, i.e., $x\in F$. It therefore follows from Theorem 3.2(iii) that

\[ x_n(\omega)\rightharpoonup x(\omega) \tag{4.19} \]

for every $\omega$ in some $\widehat{\Omega}\in\mathcal{F}$ such that $\widehat{\Omega}\subset\widetilde{\Omega}$ and $\mathsf{P}(\widehat{\Omega})=1$. We now turn to the strong convergence claims. To this end, take $\omega\in\widehat{\Omega}$. First, suppose that (g) holds. Then $A$ is demiregular at $x(\omega)$. In view of (4.18) and (4.19), $y_n(\omega)\rightharpoonup x(\omega)$. Furthermore, $v_n(\omega)\to-Bx(\omega)$ and $(y_n(\omega),v_n(\omega))_{n\in\mathbb{N}}$ lies in the graph of $A$. Altogether $y_n(\omega)\to x(\omega)$ and therefore $x_n(\omega)\to x(\omega)$. Next, suppose that (h) holds. Then, since (4.16) yields $Bx_n(\omega)\to Bx(\omega)$, (4.19) implies that $x_n(\omega)\to x(\omega)$.


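• Suppose that $\sum_{n\in\mathbb{N}}\tau_n<+\infty$, $\sum_{n\in\mathbb{N}}\lambda_n=+\infty$, and $(\forall n\in\mathbb{N})$ $\gamma_n=\gamma$. Let $T=J_{\gamma A}\circ(\mathrm{Id}-\gamma B)$. We deduce from (i) that

\[ (\forall z\in F)\quad \varliminf\|Bx_n-Bz\|=0 \tag{4.20} \]

and from (ii) that

\[ (\forall z\in F)\quad \varliminf\|x_n-Tx_n-\gamma(Bx_n-Bz)\|=0. \tag{4.21} \]

In view of (e), we obtain

\[ \varliminf\|Tx_n-x_n\|=0. \tag{4.22} \]

In addition, since (e) and [4, Proposition 4.33] imply that $T$ is nonexpansive, we derive from (1.5) that

\[ \begin{aligned} (\forall n\in\mathbb{N})\quad &\|Tx_{n+1}-x_{n+1}\|\\ &=\|Tx_{n+1}-(1-\lambda_n)x_n-\lambda_n(J_{\gamma A}(x_n-\gamma u_n)+a_n)\|\\ &=\|Tx_{n+1}-Tx_n-(1-\lambda_n)(x_n-Tx_n)-\lambda_n(J_{\gamma A}(x_n-\gamma u_n)-Tx_n)-\lambda_na_n\|\\ &\le\|Tx_{n+1}-Tx_n\|+(1-\lambda_n)\|Tx_n-x_n\|\\ &\quad+\lambda_n\|J_{\gamma A}(x_n-\gamma u_n)-J_{\gamma A}(x_n-\gamma Bx_n)\|+\lambda_n\|a_n\|\\ &\le\|x_{n+1}-x_n\|+(1-\lambda_n)\|Tx_n-x_n\|+\lambda_n\gamma\|u_n-Bx_n\|+\lambda_n\|a_n\|\\ &=\lambda_n\|J_{\gamma A}(x_n-\gamma u_n)+a_n-x_n\|+(1-\lambda_n)\|Tx_n-x_n\|+\lambda_n\gamma\|u_n-Bx_n\|+\lambda_n\|a_n\|\\ &\le\|Tx_n-x_n\|+\lambda_n\|J_{\gamma A}(x_n-\gamma u_n)-J_{\gamma A}(x_n-\gamma Bx_n)\|+\lambda_n\gamma\|u_n-Bx_n\|+2\lambda_n\|a_n\|\\ &\le\|Tx_n-x_n\|+2\lambda_n(\gamma\|u_n-Bx_n\|+\|a_n\|). \end{aligned} \tag{4.23} \]

Now set

\[ (\forall n\in\mathbb{N})\quad \xi_n=\gamma\sqrt{\lambda_n\mathsf{E}(\|u_n-Bx_n\|^2\mid\mathcal{X}_n)}+\lambda_n\sqrt{\mathsf{E}(\|a_n\|^2\mid\mathcal{X}_n)}. \tag{4.24} \]

Using (4.1), we get

\[ \begin{aligned} (\forall n\in\mathbb{N})\quad \xi_n &\le\gamma\sqrt{\lambda_n\mathsf{E}(\|u_n-\mathsf{E}(u_n\mid\mathcal{X}_n)\|^2\mid\mathcal{X}_n)}+\gamma\sqrt{\lambda_n}\,\|\mathsf{E}(u_n\mid\mathcal{X}_n)-Bx_n\|+\lambda_n\sqrt{\mathsf{E}(\|a_n\|^2\mid\mathcal{X}_n)}\\ &\le\gamma\sqrt{\lambda_n\tau_n}\,\|Bx_n-Bz\|+\gamma\sqrt{\lambda_n\zeta_n(z)}+\gamma\sqrt{\lambda_n}\,\|\mathsf{E}(u_n\mid\mathcal{X}_n)-Bx_n\|\\ &\quad+\lambda_n\sqrt{\mathsf{E}(\|a_n\|^2\mid\mathcal{X}_n)}. \end{aligned} \tag{4.25} \]

Thus, (4.23) and (2.4) yield

\[ \begin{aligned} (\forall n\in\mathbb{N})\quad \mathsf{E}(\|Tx_{n+1}-x_{n+1}\|\mid\mathcal{X}_n) &\le\|Tx_n-x_n\|+2\lambda_n\bigl(\gamma\mathsf{E}(\|u_n-Bx_n\|\mid\mathcal{X}_n)+\mathsf{E}(\|a_n\|\mid\mathcal{X}_n)\bigr)\\ &\le\|Tx_n-x_n\|+2\xi_n. \end{aligned} \tag{4.26} \]

In addition, according to the Cauchy-Schwarz inequality and (i),

\[ \sum_{n\in\mathbb{N}}\sqrt{\lambda_n\tau_n}\,\|Bx_n-Bz\|\le\sqrt{\sum_{n\in\mathbb{N}}\tau_n}\,\sqrt{\sum_{n\in\mathbb{N}}\lambda_n\|Bx_n-Bz\|^2}<+\infty. \tag{4.27} \]

Thus, it follows from assumptions (b)–(d) that $(\xi_n)_{n\in\mathbb{N}}\in\ell_+^1(\mathscr{X})$, and we deduce from Proposition 3.1(iii) and (4.26) that $(\|Tx_n-x_n\|)_{n\in\mathbb{N}}$ converges almost surely. We then derive from (4.22) that there exists $\widetilde{\Omega}\in\mathcal{F}$ such that $\mathsf{P}(\widetilde{\Omega})=1$ and (4.15) holds. Let $\omega\in\widetilde{\Omega}$. Suppose that there exist $x\in H$ and a strictly increasing sequence $(k_n)_{n\in\mathbb{N}}$ in $\mathbb{N}$ such that $x_{k_n}(\omega)\rightharpoonup x$. Since $x_{k_n}(\omega)\rightharpoonup x$ and $Tx_{k_n}(\omega)-x_{k_n}(\omega)\to 0$, the demiclosedness principle [4, Corollary 4.18] asserts that $x\in F$. Hence, the weak convergence claim follows from Theorem 3.2(iii). To establish the strong convergence claims, set $w=z-\gamma Bz$, and set $(\forall n\in\mathbb{N})$ $w_n=x_n-\gamma Bx_n$. Then $Tx_n=J_{\gamma A}w_n$ and $z=Tz=J_{\gamma A}w$. Hence, appealing to the firm nonexpansiveness of $J_{\gamma A}$, we obtain

\[ \begin{aligned} (\forall n\in\mathbb{N})\quad \langle Tx_n-z\mid x_n-Tx_n-\gamma(Bx_n-Bz)\rangle &=\langle Tx_n-z\mid w_n-Tx_n+z-w\rangle\\ &=\langle J_{\gamma A}w_n-J_{\gamma A}w\mid(\mathrm{Id}-J_{\gamma A})w_n-(\mathrm{Id}-J_{\gamma A})w\rangle\\ &\ge 0 \end{aligned} \tag{4.28} \]

and therefore

\[ (\forall n\in\mathbb{N})\quad \langle Tx_n-z\mid x_n-Tx_n\rangle\ge\gamma\langle Tx_n-z\mid Bx_n-Bz\rangle. \tag{4.29} \]

Consequently, since $T$ is nonexpansive and $B$ satisfies (1.1),

\[ \begin{aligned} (\forall n\in\mathbb{N})\quad \|x_n-z\|\,\|Tx_n-x_n\| &\ge\|Tx_n-z\|\,\|Tx_n-x_n\|\\ &\ge\langle Tx_n-z\mid x_n-Tx_n\rangle\\ &\ge\gamma\langle Tx_n-z\mid Bx_n-Bz\rangle\\ &=\gamma\bigl(\langle Tx_n-x_n\mid Bx_n-Bz\rangle+\langle x_n-z\mid Bx_n-Bz\rangle\bigr)\\ &\ge-\gamma\|Tx_n-x_n\|\,\|Bx_n-Bz\|+\gamma\vartheta\|Bx_n-Bz\|^2\\ &\ge-\frac{\gamma}{\vartheta}\|Tx_n-x_n\|\,\|x_n-z\|+\gamma\vartheta\|Bx_n-Bz\|^2 \end{aligned} \tag{4.30} \]

and hence

\[ (\forall n\in\mathbb{N})\quad \|Bx_n-Bz\|^2\le\frac{1}{\vartheta}\Bigl(\frac{1}{\gamma}+\frac{1}{\vartheta}\Bigr)\|x_n-z\|\,\|Tx_n-x_n\|. \tag{4.31} \]

Since, $\mathsf{P}$-a.s., $(x_n)_{n\in\mathbb{N}}$ is bounded and $Tx_n-x_n\to 0$, we infer that $Bx_n\to Bz$ $\mathsf{P}$-a.s. Thus there exists $\widehat{\Omega}\in\mathcal{F}$ such that $\widehat{\Omega}\subset\widetilde{\Omega}$, $\mathsf{P}(\widehat{\Omega})=1$, and

\[ (\forall\omega\in\widehat{\Omega})\quad x_n(\omega)\rightharpoonup x(\omega)\quad\text{and}\quad Bx_n(\omega)\to Bx(\omega). \tag{4.32} \]

Thus, (h) $\Rightarrow$ $x_n(\omega)\to x(\omega)$. Finally, if (g) holds, the strong convergence of $(x_n(\omega))_{n\in\mathbb{N}}$ follows from the same arguments as in the previous case. □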


Remark 4.2 The demiregularity property in Theorem 4.1(iv) is satisfied by a wide class of operators, e.g., uniformly monotone operators or subdifferentials of proper lower semicontinuous uniformly convex functions; further examples are provided in [2, Proposition 2.4].
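Condition (e) of Theorem 4.1 ties the admissible proximal parameters to the cocoercivity constant $\vartheta$ and to the relative variance bound $\tau_n$ in (4.1). A minimal sketch of how one might check it numerically is given below; the helper names are hypothetical, and a check on a finite prefix of the sequences can only detect violations, since the strict bounds on the infimum and supremum concern the whole sequences.

```python
def max_step_size(theta, tau_sup):
    """Upper limit 2*theta / (1 + tau_sup) that gamma_n must stay strictly below
    when sup (1 + tau_n) * gamma_n < 2*theta is enforced with tau_n <= tau_sup."""
    if theta <= 0 or tau_sup < 0:
        raise ValueError("theta must be positive and tau_sup nonnegative")
    return 2.0 * theta / (1.0 + tau_sup)


def satisfies_condition_e(gammas, taus, theta):
    """Finite-prefix check of gamma_n > 0 and (1 + tau_n) * gamma_n < 2*theta."""
    return all(g > 0 and (1.0 + t) * g < 2.0 * theta for g, t in zip(gammas, taus))
```

For instance, with $\vartheta=1$ and $\tau_n\le 1$ the steps must remain below `max_step_size(1.0, 1.0)`, that is, below $1$, whereas in the noise-free case $\tau_n=0$ the classical range $\left]0,2\vartheta\right[$ of (1.4) is recovered.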


Remark 4.3 To place our analysis in perspective, we comment on results of the literature that seem the most pertinently related to Theorem 4.1.


(i) In the deterministic case, Theorem 4.1(iii) can be found in [14, Corollary 6.5].


(ii) In [1, Corollary 8], Problem 1.2 is considered in the special case when $H=\mathbb{R}^N$ and solved via (1.5). Almost sure convergence properties are established under the following assumptions: $(\gamma_n)_{n\in\mathbb{N}}$ is a decreasing sequence in $\left]0,\vartheta\right]$ such that $\sum_{n\in\mathbb{N}}\gamma_n=+\infty$, $\lambda_n\equiv 1$, $a_n\equiv 0$, and the sequence $(x_n)_{n\in\mathbb{N}}$ is bounded a priori.


(iii) In [46], Problem 1.1 is addressed using Algorithm 1.3. The authors make the additional assumptions that

\[ (\forall n\in\mathbb{N})\quad \mathsf{E}(u_n\mid\mathcal{X}_n)=Bx_n\quad\text{and}\quad a_n=0. \tag{4.33} \]

Furthermore they employ vanishing proximal parameters $(\gamma_n)_{n\in\mathbb{N}}$. Almost sure convergence properties of the sequence $(x_n)_{n\in\mathbb{N}}$ are then established under the additional assumption that $B$ is uniformly monotone.


(iv) The recently posted paper [47] employs tools from [16] to investigate the convergence of a variant of (1.5) in which no errors $(a_n)_{n\in\mathbb{N}}$ are allowed in the implementation of the resolvents, and an inertial term is added, namely,

\[ (\forall n\in\mathbb{N})\quad x_{n+1}=x_n+\lambda_n\bigl(J_{\gamma_nA}(x_n+\rho_n(x_n-x_{n-1})-\gamma_nu_n)-x_n\bigr),\quad\text{where}\quad\rho_n\in\left[0,1\right[. \tag{4.34} \]

In the case when $\rho_n\equiv 0$, assertions (iii) and (iv)(h) of Theorem 4.1 are obtained under the additional hypothesis that $\inf_{n\in\mathbb{N}}\lambda_n>0$ and the stochastic approximations which can be performed are constrained by (4.33).


Next, we provide a version of Theorem l3.2l in which a variant of (II.5D featuring approximations 
(A„)„gH of the operator A is used. In the deterministic forward-backward method, such approxi¬ 
mations were first used in Il3^ Proposition 3.2] (see also llT4l Proposition 6.7] for an alternative 
proof). 

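Proposition 4.4 Consider the setting of Problem 1.1. Let $x_0$, $(u_n)_{n\in\mathbb{N}}$, and $(a_n)_{n\in\mathbb{N}}$ be random variables in $L^2(\Omega,\mathcal{F},\mathsf{P};H)$, let $(\lambda_n)_{n\in\mathbb{N}}$ be a sequence in $\left]0,1\right]$, let $(\gamma_n)_{n\in\mathbb{N}}$ be a sequence in $\left]0,2\vartheta\right[$, and let $(A_n)_{n\in\mathbb{N}}$ be a sequence of maximally monotone operators from $H$ to $2^{H}$. Set

\[ (\forall n\in\mathbb{N})\quad x_{n+1}=x_n+\lambda_n\bigl(J_{\gamma_nA_n}(x_n-\gamma_nu_n)+a_n-x_n\bigr). \tag{4.35} \]

Suppose that assumptions (a)–(f) in Theorem 4.1 are satisfied, as well as the following:

(k) There exist sequences $(\alpha_n)_{n\in\mathbb{N}}$ and $(\beta_n)_{n\in\mathbb{N}}$ in $\left[0,+\infty\right[$ such that $\sum_{n\in\mathbb{N}}\sqrt{\lambda_n}\,\alpha_n<+\infty$, $\sum_{n\in\mathbb{N}}\lambda_n\beta_n<+\infty$, and

\[ (\forall n\in\mathbb{N})(\forall x\in H)\quad \|J_{\gamma_nA_n}x-J_{\gamma_nA}x\|\le\alpha_n\|x\|+\beta_n. \tag{4.36} \]

Then the conclusions of Theorem 4.1 remain valid.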

Proof. Let z G F. We have 

(VtT, G N) ||X) 2 -|_i z|| ^ (1 A) 2 )||x^ z|| + A^ll J.y^An (®)i 'ynUn') z|| + An,||On||' (4.37) 

In addition, 

(Vn G N) ||J^„A„(a:n - InUn) - z|| 

^ Whr.huiXn - InUn) “ J 7 „A„(z - 7nBz)|| + ||J 7 „A„(z - 7nBz) - J^„a(z - 7nBz)|| 

^ \\Xn - InUn “ Z + 7nBz|| + ||J 7 „A„(z - 7nBz) - J 7 „a(z - 7nBz)|| 

^ \\Xn Z 7n(BXn Bz) 7n(’*^n B('U ,2 | X,^)) || -|- 7n || E('U^ | X^,) || 

+ l|J 7 nA„(z - 7 „Bz) - J 7 „a(z - 7 nBz)||. (4.38) 

On the other hand, using assumptions | (d) | and | (e) | in Theorem 14. II as well as (II.ID . we obtain as in 

(EH) 

(Vn G N) E(||x„ - z - 7 n(Bxn - Bz) - 7„(nn - £(«„ | X„))|p | X„) 

^ \\Xn - zf - 7n(2l? - (1 + rn)7n)||BXn - Bzf + 7^Cn(z) 

^ \\Xn - z|p + 7nCn(z), (4.39) 

which implies that 

(Vn G N) E(||x„ - z - 7 n(Bx„ - Bz) - 7„(nn - E(nn | X„))|| | X„) 

^ \\Xn - z|| + 7n\/Cn(z). (4.40) 

Combining (I4.37D . (I4.38D . and (I4.40D yields 

(VnGN) E(||x„+i-z|||X,,) 

^ \\Xn ^11 “F ^nhn Cn(z) T AjT,7n || E(nfi | X,^) Bx,^ || 

+ An||J 7 „A„(z - 7nBz) - Jt,„a(z - 7nBz)|| + AnE(||an|| | X„) 

^ 11 Xn Z11 + 7n \/ AnCn(z) T "x/ An 11 E (n^ | X,j ) Bx^ 11 

+ An||J 7 „A„(z - 7nBz) - J 7 „a(z - 7nBz)|| + A^ x/E( || P | X„). (4.41) 

Since |E Proposition 4.33] asserts that 

the operators (Id — 7 „B)„gp^ are nonexpansive, (4.42) 

it follows from I (k) I that 

(Vn G N) An||J 7 „A„(z - 7nBz) - J.y„A(z - 7nBz)|| ^ ■\/X^an\\z - 7nBz|| + XnPn 

^ x/^ctnllzll + An/3n. (4.43) 
Thus,


Y1 '^^ll-’ 7 nA„(z - 7 „Bz) - Jt,„a(z - 7nBz)|| < +00. (4.44) 

ttGN 


In view of assumptions (a)–(e) in Theorem 4.1 and (4.44), we deduce from (4.41) and Proposition 3.1(ii) that $(x_n)_{n\in\mathbb{N}}$ is almost surely bounded. In turn, (4.42) asserts that $(x_n-\gamma_nBx_n)_{n\in\mathbb{N}}$ is likewise. Now set


(Vn G N) Qn — -^ 7 tiA„ ’^n'^n) J 7 ,iA(®n 'yn'^n) T (4.45) 

Then (14.351) can be rewritten as 

(Vn G N) Xn-\-i — Xfi + 'Jn’^n) T On • (4.46) 

However, 

(Vn E N) A/E(||anP|X„) ^ Y^E(||J^^A„(a;n - InUn) - i-^r,^{Xn - InUnW I ^n) 

+ \/E(||a„||2|X„). (4.47) 

On the other hand, according to |(k)l assumption | (d) | in Theorem 14.1[ and (I4.42D , 

(VtT/ G N) Xn 'sj E( II J'yyj^An i^n '^7nA (^n |P | ^n) 

^ '\/W^n || H“ /^n)^ | ^ri) 

^ ll^^n 'Jn || Tir ll'^n || -|- Pn')‘^ \ ^n) 

^ XnO^n (ll^n 'Jn || -|- '^n '\/E(||'U^ | ~\~ XnPn 

^ XnOl-n (ll^n Tir|| H“ Tir || | BXtt, || 

+ ln^/^{\\Un - E(w^|X^)P|X^)) + 

^ Xn(y.n (ll^n In || “1“ Tir || \ ^n) || + 'Y'n.'x/^^H Bx-^, Bz|| 

'Jn's/ Cn( 2 )) H“ Xn/3n 

^ \/(IITjtBXjt,|| + '7n'\/|| ^(^ 7 ^ | OCn) 6 X 77 , || + ' 7 ti\/^^|| 6 X 77 Bz|| 

+ Jn's/ XnCni'Z^)) + XnPn- (4.48) 

However, assumptions | (c) | and | (d) | in Theorem 14. 1 1 guarantee that (v^A^||E(w 77 1X 77 ) — Bx^ilD^er^ and 
(\/AnCn(z))nGN P-a.s. bounded. Since {Bxn)nGN and (x^ — 7776 x 77 ) 77 ^^ are likewise, it follows 
from I (k) I and (I4.42D that 

^ ^ Xn “sj B( II ■J7nAn {p^n || ^ | ^n) H“ 00 , (4.49) 

riGN 

and consequently that 

y; XnVHWanW^l^n) < + 00 . (4.50) 

neN 

Applying Theorem 4.1 to algorithm (4.46) then yields the claims. □
5 Applications


As discussed in the Introduction, the forward-backward algorithm is quite versatile and it can be applied in various forms. Many standard applications of Theorem 4.1 can of course be recovered for specific choices of $A$ and $B$, in particular Problem 1.2. Using the product space framework of [2], it can also be applied to solve systems of coupled monotone inclusions. On the other hand, using the approach proposed in [15, 20], it can be used to solve strongly monotone composite inclusions (in particular, strongly convex composite minimization problems), say,

\[ \text{find}\ x\in H\ \text{such that}\quad z\in Ax+\sum_{k=1}^{q}L_k^*\bigl((B_k\mathbin{\Box}D_k)(L_kx-r_k)\bigr)+\rho x, \tag{5.1} \]

since their dual problems assume the general form of Problem 1.1 and the primal solution can trivially be recovered from any dual solution. In (5.1), $z\in H$, $\rho\in\left]0,+\infty\right[$ and, for every $k\in\{1,\ldots,q\}$, $r_k$ lies in a real Hilbert space $G_k$, $B_k\colon G_k\to 2^{G_k}$ is maximally monotone, $D_k\colon G_k\to 2^{G_k}$ is maximally monotone and strongly monotone, $B_k\mathbin{\Box}D_k=(B_k^{-1}+D_k^{-1})^{-1}$, and $L_k\in\mathscr{B}(H,G_k)$. In such instances the forward-backward algorithm actually yields a primal-dual method which produces a sequence converging to the primal solution (see [20, Section 5] for details). Now suppose that, in addition, $C\colon H\to H$ is cocoercive. As in [17], consider the primal problem

\[ \text{find}\ x\in H\ \text{such that}\quad z\in Ax+\sum_{k=1}^{q}L_k^*\bigl((B_k\mathbin{\Box}D_k)(L_kx-r_k)\bigr)+Cx, \tag{5.2} \]

together with the dual problem

\[ \begin{aligned} &\text{find}\ v_1\in G_1,\ldots,v_q\in G_q\ \text{such that}\\ &(\forall k\in\{1,\ldots,q\})\quad -r_k\in-L_k(A+C)^{-1}\Bigl(z-\sum_{l=1}^{q}L_l^*v_l\Bigr)+B_k^{-1}v_k+D_k^{-1}v_k. \end{aligned} \tag{5.3} \]

Using renorming techniques in the primal-dual space going back to Il3^ in the context of
finite-dimensional minimization problems, the primal-dual problem (5.2)-(5.3) can be reduced to an
instance of Problem 1.1 Il20l |5^ (see also [23]) and therefore solved via Theorem 4.1. Next,
we explicitly illustrate an application of this approach in the special case when (5.2)-(5.3) is a
minimization problem.


5.1 A stochastic primal-dual minimization method 

We denote by $\Gamma_0(H)$ the class of proper lower semicontinuous convex functions. The Moreau
subdifferential of $f \in \Gamma_0(H)$ is the maximally monotone operator

$\partial f\colon H \to 2^{H}\colon x \mapsto \{u \in H \mid (\forall y \in H)\; (y - x \mid u) + f(x) \le f(y)\}.$    (5.4)

The inf-convolution of $f\colon H \to \,]-\infty,+\infty]$ and $h\colon H \to \,]-\infty,+\infty]$ is defined as
$f \,\square\, h\colon H \to [-\infty,+\infty]\colon x \mapsto \inf_{y \in H} (f(y) + h(x - y))$. The conjugate of a function
$f \in \Gamma_0(H)$ is the function $f^* \in \Gamma_0(H)$ defined by
$(\forall u \in H)\; f^*(u) = \sup_{x \in H} ((x \mid u) - f(x))$. Let $U$ be a strongly positive
self-adjoint operator in $\mathcal{B}(H)$. The proximity operator of $f \in \Gamma_0(H)$ relative to the metric
induced by $U$ is


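$\operatorname{prox}_f^{U}\colon H \to H\colon x \mapsto \underset{y \in H}{\operatorname{argmin}} \Big( f(y) + \frac{1}{2}\|x - y\|_U^2 \Big),$    (5.5)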


where 

$(\forall x \in H)\quad \|x\|_U = \sqrt{(x \mid Ux)}.$    (5.6)

We have $\operatorname{prox}_f^{U} = J_{U^{-1}\partial f}$.
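
As a concrete illustration of (5.5), the following minimal Python sketch evaluates the proximity operator of the $\ell^1$ norm on a Euclidean space relative to a diagonal metric $U = \operatorname{diag}(u_1,\ldots,u_N)$ with every $u_i > 0$. The choice of $f$, the diagonal form of $U$, and the names `prox_l1_diag_metric` and `objective` are assumptions made for illustration only. In this case the minimization in (5.5) separates across coordinates and reduces to soft-thresholding at level $1/u_i$.

```python
def prox_l1_diag_metric(x, u):
    """Proximity operator of the l1 norm relative to U = diag(u), u_i > 0.

    Solves, coordinate by coordinate,
        minimize |y_i| + (u_i / 2) * (x_i - y_i)**2,
    whose solution is soft-thresholding of x_i at level 1 / u_i.
    """
    if any(ui <= 0 for ui in u):
        raise ValueError("U must be strongly positive: all u_i > 0")
    out = []
    for xi, ui in zip(x, u):
        t = 1.0 / ui
        if xi > t:
            out.append(xi - t)
        elif xi < -t:
            out.append(xi + t)
        else:
            out.append(0.0)
    return out


def objective(y, x, u):
    """Value of f(y) + (1/2) * ||x - y||_U^2 for f = l1 norm, U = diag(u)."""
    return sum(abs(yi) + 0.5 * ui * (xi - yi) ** 2 for yi, xi, ui in zip(y, x, u))


# Illustrative input (assumed values).
x = [3.0, -0.2, 1.0]
u = [1.0, 2.0, 4.0]
p = prox_l1_diag_metric(x, u)
```

The identity $\operatorname{prox}_f^{U} = J_{U^{-1}\partial f}$ can be checked by hand on the first coordinate: with $y = 2$ and $u_1 = 1$, one has $y + u_1^{-1}\operatorname{sign}(y) = 3 = x_1$.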

We apply Theorem 4.1 to derive a stochastic version of a primal-dual optimization algorithm for
solving a multivariate optimization problem which was first proposed in [17, Section 4].


Problem 5.1 Let $f \in \Gamma_0(H)$, let $h\colon H \to \mathbb{R}$ be convex and differentiable with a Lipschitz-continuous
gradient, and let $q$ be a strictly positive integer. For every $k \in \{1,\ldots,q\}$, let $G_k$ be a separable
Hilbert space, let $g_k \in \Gamma_0(G_k)$, let $j_k \in \Gamma_0(G_k)$ be strongly convex, and let $L_k \in \mathcal{B}(H, G_k)$. Let
$G = G_1 \oplus \cdots \oplus G_q$ be the direct Hilbert sum of $G_1, \ldots, G_q$, and suppose that there exists $x \in H$ such
that

$0 \in \partial f(x) + \sum_{k=1}^{q} L_k^*\big((\partial g_k \,\square\, \partial j_k)(L_k x)\big) + \nabla h(x).$    (5.7)

Let F be the set of solutions to the problem 

$\underset{x \in H}{\text{minimize}}\;\; f(x) + \sum_{k=1}^{q} (g_k \,\square\, j_k)(L_k x) + h(x)$    (5.8)

and let F* be the set of solutions to the dual problem 


$\underset{v \in G}{\text{minimize}}\;\; (f^* \,\square\, h^*)\Big({-\sum_{k=1}^{q} L_k^* v_k}\Big) + \sum_{k=1}^{q} \big(g_k^*(v_k) + j_k^*(v_k)\big),$    (5.9)


where we denote by $v = (v_1, \ldots, v_q)$ a generic point in $G$. The problem is to find a point in $F \times F^*$.


We address the case when only stochastic approximations of the gradients of $h$ and $(j_k^*)_{1 \le k \le q}$
and approximations of the function $f$ are available to solve Problem 5.1.

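Algorithm 5.2 Consider the setting of Problem 5.1 and let $W \in \mathcal{B}(H)$ be strongly positive and
self-adjoint. Let $(f_n)_{n\in\mathbb{N}}$ be a sequence in $\Gamma_0(H)$, let $(\lambda_n)_{n\in\mathbb{N}}$ be a sequence in $]0,1]$ such that
$\sum_{n\in\mathbb{N}} \lambda_n = +\infty$, and, for every $k \in \{1,\ldots,q\}$, let $U_k \in \mathcal{B}(G_k)$ be strongly positive and self-adjoint.
Let $x_0$, $(u_n)_{n\in\mathbb{N}}$, and $(b_n)_{n\in\mathbb{N}}$ be random variables in $L^2(\Omega, \mathcal{F}, \mathsf{P}; H)$, and let $v_0$, $(s_n)_{n\in\mathbb{N}}$, and
$(c_n)_{n\in\mathbb{N}}$ be random variables in $L^2(\Omega, \mathcal{F}, \mathsf{P}; G)$. Iterate

for $n = 0, 1, \ldots$
    $y_n = \operatorname{prox}_{f_n}^{W^{-1}}\Big(x_n - W\Big(\sum_{k=1}^{q} L_k^* v_{k,n} + u_n\Big)\Big) + b_n$
    $x_{n+1} = x_n + \lambda_n (y_n - x_n)$    (5.10)
    for $k = 1, \ldots, q$
        $w_{k,n} = \operatorname{prox}_{g_k^*}^{U_k^{-1}}\big(v_{k,n} + U_k\big(L_k(2y_n - x_n) - s_{k,n}\big)\big) + c_{k,n}$
        $v_{k,n+1} = v_{k,n} + \lambda_n (w_{k,n} - v_{k,n}).$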
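
To make the structure of iteration (5.10) concrete, here is a minimal Python sketch on a scalar toy instance. All modelling choices are assumptions for illustration and are not taken from the paper: $H = G_1 = \mathbb{R}$, $q = 1$, $L_1 = 1$, $f = f_n = 0$, $g_1 = |\cdot|$ (so that the proximity operator of $g_1^*$ is the projection onto $[-1,1]$), $j_1$ the indicator of $\{0\}$ (so that $\nabla j_1^* = 0$ and $s_{1,n} = 0$), $h(x) = \frac12 (x-a)^2$, $b_n = c_n = 0$, $\lambda_n = 1$, and gradient noise with standard deviation decaying like $(n+1)^{-3/2}$. The names `run_primal_dual`, `w`, and `u_metric` are hypothetical.

```python
import random


def run_primal_dual(a=3.0, w=0.5, u_metric=0.5, n_iter=200, sigma=0.1, seed=0):
    """Scalar sketch of iteration (5.10) for minimize |x| + 0.5 * (x - a)**2.

    Assumed toy setting: f = 0, g = |.|, j = indicator of {0}, L = 1,
    b_n = c_n = 0, lambda_n = 1. W = w and U_1 = u_metric are positive scalars.
    """
    rng = random.Random(seed)
    x, v = 0.0, 0.0
    lam = 1.0
    for n in range(n_iter):
        # Stochastic approximation u_n of grad h(x_n) = x_n - a (decaying noise).
        u_n = (x - a) + sigma * rng.gauss(0.0, 1.0) / (n + 1) ** 1.5
        # Primal step: prox of f = 0 is the identity, so y_n = x_n - W(L* v_n + u_n).
        y = x - w * (v + u_n)
        # Dual step: prox of g* = indicator of [-1, 1] is the projection onto [-1, 1];
        # s_{1,n} = grad j*(v_n) = 0 in this toy setting. It uses the old x_n.
        w_dual = min(1.0, max(-1.0, v + u_metric * (2.0 * y - x)))
        # Relaxation steps for the primal and dual variables.
        x = x + lam * (y - x)
        v = v + lam * (w_dual - v)
    return x, v


x_final, v_final = run_primal_dual()
```

In this toy problem the minimizer is the soft-thresholding of $a$ at level 1, that is, $x = 2$ for $a = 3$, with dual solution $v = 1$; the iterates of the sketch should approach these values.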

Proposition 5.3 Consider the setting of Problem 5.1, let $\mathcal{X} = (\mathcal{X}_n)_{n\in\mathbb{N}}$ be a sequence of sub-sigma-algebras
of $\mathcal{F}$, and let $(x_n)_{n\in\mathbb{N}}$ and $(v_n)_{n\in\mathbb{N}}$ be sequences generated by Algorithm 5.2. Let $\mu \in \,]0,+\infty[$
be a Lipschitz constant of the gradient of $h \circ W^{1/2}$ and, for every $k \in \{1,\ldots,q\}$, let $\nu_k \in \,]0,+\infty[$ be a
Lipschitz constant of the gradient of $j_k^* \circ U_k^{1/2}$. Assume that the following hold:

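(a) $(\forall n \in \mathbb{N})\quad \sigma\big((x_{n'}, v_{n'})_{0 \le n' \le n}\big) \subset \mathcal{X}_n \subset \mathcal{X}_{n+1}$.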

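(b) $\sum_{n\in\mathbb{N}} \lambda_n \sqrt{\mathsf{E}(\|b_n\|^2 \mid \mathcal{X}_n)} < +\infty$ and $\sum_{n\in\mathbb{N}} \lambda_n \sqrt{\mathsf{E}(\|c_n\|^2 \mid \mathcal{X}_n)} < +\infty$.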

(c) $\sum_{n\in\mathbb{N}} \sqrt{\lambda_n}\, \|\mathsf{E}(u_n \mid \mathcal{X}_n) - \nabla h(x_n)\| < +\infty$.

(d) For every $k \in \{1,\ldots,q\}$, $\sum_{n\in\mathbb{N}} \sqrt{\lambda_n}\, \|\mathsf{E}(s_{k,n} \mid \mathcal{X}_n) - \nabla j_k^*(v_{k,n})\| < +\infty$.

(e) There exists a summable sequence $(\tau_n)_{n\in\mathbb{N}}$ in $[0,+\infty[$ such that, for every $(x, v) \in F \times F^*$, there

exists (Cn(x,v))^gj^ G such that (A„Cn(x, G and 

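$(\forall n \in \mathbb{N})\quad \mathsf{E}(\|u_n - \mathsf{E}(u_n \mid \mathcal{X}_n)\|^2 \mid \mathcal{X}_n) + \mathsf{E}(\|s_n - \mathsf{E}(s_n \mid \mathcal{X}_n)\|^2 \mid \mathcal{X}_n)$
$\qquad \le \tau_n \Big( \|\nabla h(x_n) - \nabla h(x)\|^2 + \sum_{k=1}^{q} \|\nabla j_k^*(v_{k,n}) - \nabla j_k^*(v_k)\|^2 \Big) + \zeta_n(x, v).$    (5.11)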

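(f) There exist sequences $(\alpha_n)_{n\in\mathbb{N}}$ and $(\beta_n)_{n\in\mathbb{N}}$ in $[0,+\infty[$ such that $\sum_{n\in\mathbb{N}} \sqrt{\lambda_n}\,\alpha_n < +\infty$,
$\sum_{n\in\mathbb{N}} \lambda_n \beta_n < +\infty$, and

$(\forall n \in \mathbb{N})(\forall x \in H)\quad \|\operatorname{prox}_{f_n}^{W^{-1}} x - \operatorname{prox}_{f}^{W^{-1}} x\| \le \alpha_n \|x\| + \beta_n.$    (5.12)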

(g) $\max\{\mu, \nu_1, \ldots, \nu_q\} < 2\Big(1 - \sqrt{\sum_{k=1}^{q} \|U_k^{1/2} L_k W^{1/2}\|^2}\Big)$.

Then, the following hold for some $F$-valued random variable $x$ and some $F^*$-valued random variable $v$:

(i) $(x_n)_{n\in\mathbb{N}}$ converges weakly P-a.s. to $x$ and $(v_n)_{n\in\mathbb{N}}$ converges weakly P-a.s. to $v$.

(ii) Suppose that $\nabla h$ is demiregular at every $x \in F$. Then $(x_n)_{n\in\mathbb{N}}$ converges strongly P-a.s. to $x$.

(iii) Suppose that there exists $k \in \{1,\ldots,q\}$ such that, for every $v \in F^*$, $\nabla j_k^*$ is demiregular at $v_k$.
Then $(v_{k,n})_{n\in\mathbb{N}}$ converges strongly P-a.s. to $v_k$.
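Proof. The proof relies on the ability to employ a constant proximal parameter in algorithm (4.35).
Let us define $K = H \oplus G$, $g\colon G \to \,]-\infty,+\infty]\colon v \mapsto \sum_{k=1}^{q} g_k(v_k)$,
$j\colon G \to \,]-\infty,+\infty]\colon v \mapsto \sum_{k=1}^{q} j_k(v_k)$, $L\colon H \to G\colon x \mapsto (L_1 x, \ldots, L_q x)$, and
$U\colon G \to G\colon v \mapsto (U_1 v_1, \ldots, U_q v_q)$. Let us now introduce the set-valued operator

$A\colon K \to 2^{K}\colon (x,v) \mapsto (\partial f(x) + L^* v) \times (-Lx + \partial g^*(v)),$    (5.13)

the single-valued operator

$B\colon K \to K\colon (x,v) \mapsto (\nabla h(x), \nabla j^*(v)),$    (5.14)

and the bounded linear operator

$V\colon K \to K\colon (x,v) \mapsto (W^{-1}x - L^* v,\; -Lx + U^{-1}v).$    (5.15)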

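Further, set

$\vartheta = \Big(1 - \sqrt{\sum_{k=1}^{q} \|U_k^{1/2} L_k W^{1/2}\|^2}\Big) \min\{\mu^{-1}, \nu_1^{-1}, \ldots, \nu_q^{-1}\}$    (5.16)

and

$(\forall n \in \mathbb{N})\quad \widetilde{\tau}_n = \|V^{-1}\|\,\|V\|\,\tau_n.$    (5.17)

Since (e) imposes that $\sum_{n\in\mathbb{N}} \tau_n < +\infty$, we assume without loss of generality that

$\sup_{n\in\mathbb{N}} \widetilde{\tau}_n < 2\vartheta - 1.$    (5.18)

In the renormed space $(K, \|\cdot\|_V)$, $V^{-1}A$ is maximally monotone and $V^{-1}B$ is cocoercive [20,
Lemma 3.7] with cocoercivity constant $\vartheta$ [43, Lemma 4.3]. In addition, finding a zero of the sum
of these operators is equivalent to finding a point in $F \times F^*$, and algorithm (4.35) with $\gamma_n \equiv 1$ for
solving this monotone inclusion problem specializes to (5.10) (see [20, 43] for details), which can
thus be rewritten as

$(\forall n \in \mathbb{N})\quad (x_{n+1}, v_{n+1}) = (x_n, v_n) + \lambda_n \Big( J_{V^{-1}A_n}\big((x_n, v_n) - V^{-1}(u_n, s_n)\big) + a_n - (x_n, v_n) \Big),$    (5.19)

where

$(\forall n \in \mathbb{N})\quad a_n = (b_n, c_n)$    (5.20)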

and 

$(\forall n \in \mathbb{N})\quad A_n\colon K \to 2^{K}\colon (x,v) \mapsto (\partial f_n(x) + L^* v) \times (-Lx + \partial g^*(v)).$    (5.21)

Then 

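$(\forall n \in \mathbb{N})(\forall (x,v) \in K)\quad J_{V^{-1}A_n}(x,v) = \big(y,\; \operatorname{prox}_{g^*}^{U^{-1}}(v + UL(2y - x))\big),$

where $y = \operatorname{prox}_{f_n}^{W^{-1}}(x - WL^* v)$.    (5.22)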
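
Formula (5.22) can be sanity-checked numerically. The sketch below assumes a scalar setting that is not in the paper: $H = G = \mathbb{R}$, $L = \ell$, $W = w$, $U = u$, $f_n(x) = x^2/2$, and $g^*(v) = v^2/2$, so that both proximity operators are explicit. It evaluates the right-hand side of (5.22) and verifies the defining relation $V((x,v) - p) \in A_n p$ of $p = J_{V^{-1}A_n}(x,v)$, using (5.15) and (5.21). The function names are hypothetical.

```python
def prox_half_square(t, metric_inv):
    """prox of y -> y**2 / 2 relative to the metric 1 / metric_inv (scalar case).

    Minimizes y**2 / 2 + (t - y)**2 / (2 * metric_inv), giving t / (1 + metric_inv).
    """
    return t / (1.0 + metric_inv)


def resolvent(x, v, ell, w, u):
    """Right-hand side of (5.22) in the assumed scalar quadratic setting."""
    y = prox_half_square(x - w * ell * v, w)
    p = prox_half_square(v + u * ell * (2.0 * y - x), u)
    return y, p


def inclusion_residual(x, v, ell, w, u):
    """Residual of V((x, v) - (y, p)) = A_n(y, p), which should vanish."""
    y, p = resolvent(x, v, ell, w, u)
    dx, dv = x - y, v - p
    # V(dx, dv) = (dx / w - ell * dv, -ell * dx + dv / u), see (5.15).
    lhs = (dx / w - ell * dv, -ell * dx + dv / u)
    # A_n(y, p) = (y + ell * p, -ell * y + p) for the quadratic f_n and g*.
    rhs = (y + ell * p, -ell * y + p)
    return max(abs(lhs[0] - rhs[0]), abs(lhs[1] - rhs[1]))


# Illustrative values (assumed); the residual should be at rounding level.
res = inclusion_residual(x=1.3, v=-0.7, ell=0.8, w=0.5, u=0.4)
```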
Assumption (b) is equivalent to $\sum_{n\in\mathbb{N}} \lambda_n \sqrt{\mathsf{E}(\|a_n\|_V^2 \mid \mathcal{X}_n)} < +\infty$, and assumptions (c) and (d) imply
that

$\sum_{n\in\mathbb{N}} \sqrt{\lambda_n}\, \|\mathsf{E}(V^{-1}(u_n, s_n) \mid \mathcal{X}_n) - V^{-1}B(x_n, v_n)\|_V < +\infty.$    (5.23)

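For every $(x, v) \in F \times F^*$, assumption (e) yields

$(\forall n \in \mathbb{N})\quad \mathsf{E}(\|V^{-1}(u_n, s_n) - \mathsf{E}(V^{-1}(u_n, s_n) \mid \mathcal{X}_n)\|_V^2 \mid \mathcal{X}_n)$
$\qquad \le \|V^{-1}\| \big( \mathsf{E}(\|u_n - \mathsf{E}(u_n \mid \mathcal{X}_n)\|^2 \mid \mathcal{X}_n) + \mathsf{E}(\|s_n - \mathsf{E}(s_n \mid \mathcal{X}_n)\|^2 \mid \mathcal{X}_n) \big)$
$\qquad \le \|V^{-1}\| \big( \tau_n (\|\nabla h(x_n) - \nabla h(x)\|^2 + \|\nabla j^*(v_n) - \nabla j^*(v)\|^2) + \zeta_n(x, v) \big)$
$\qquad \le \widetilde{\tau}_n \|V^{-1}B(x_n, v_n) - V^{-1}B(x, v)\|_V^2 + \widetilde{\zeta}_n(x, v),$    (5.24)

where

$(\forall n \in \mathbb{N})\quad \widetilde{\zeta}_n(x, v) = \|V^{-1}\|\, \zeta_n(x, v).$    (5.25)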

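According to assumption (e), $(\widetilde{\zeta}_n(x,v))_{n\in\mathbb{N}}$ and $(\lambda_n \widetilde{\zeta}_n(x,v))_{n\in\mathbb{N}}$ inherit the properties required there of $(\zeta_n(x,v))_{n\in\mathbb{N}}$ and $(\lambda_n \zeta_n(x,v))_{n\in\mathbb{N}}$. Now, let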

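$n \in \mathbb{N}$, let $(x, v) \in K$, and set $\widetilde{y} = \operatorname{prox}_{f}^{W^{-1}}(x - WL^* v)$, with $y$ as in (5.22). By (5.22) and the
nonexpansiveness of $\operatorname{prox}_{g^*}^{U^{-1}}$ in $(G, \|\cdot\|_{U^{-1}})$, we obtain

$\|J_{V^{-1}A_n}(x,v) - J_{V^{-1}A}(x,v)\|_V^2$
$\qquad \le \|V\| \big( \|y - \widetilde{y}\|^2 + \|\operatorname{prox}_{g^*}^{U^{-1}}(v + UL(2y - x)) - \operatorname{prox}_{g^*}^{U^{-1}}(v + UL(2\widetilde{y} - x))\|^2 \big)$
$\qquad \le \|V\| \big( \|y - \widetilde{y}\|^2 + 4\|UL(y - \widetilde{y})\|_{U^{-1}}^2 \big)$
$\qquad \le \|V\| (1 + 4\|U\|\,\|L\|^2)\, \|y - \widetilde{y}\|^2.$    (5.26)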

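It follows from (f) that

$\|J_{V^{-1}A_n}(x,v) - J_{V^{-1}A}(x,v)\|_V$
$\qquad \le \|V\|^{1/2} (1 + 2\|U\|^{1/2}\|L\|)\, \|\operatorname{prox}_{f_n}^{W^{-1}}(x - WL^* v) - \operatorname{prox}_{f}^{W^{-1}}(x - WL^* v)\|$
$\qquad \le \|V\|^{1/2} (1 + 2\|U\|^{1/2}\|L\|) (\alpha_n \|x - WL^* v\| + \beta_n)$
$\qquad \le \|V\|^{1/2} (1 + 2\|U\|^{1/2}\|L\|) \big(\alpha_n (\|x\| + \|WL^*\|\,\|v\|) + \beta_n\big)$
$\qquad \le \widetilde{\alpha}_n \|(x,v)\|_V + \widetilde{\beta}_n,$    (5.27)

where

$\widetilde{\alpha}_n = \sqrt{2}\, \|V\|^{1/2} (1 + 2\|U\|^{1/2}\|L\|) \max\{1, \|WL^*\|\}\, \|V^{-1}\|^{1/2} \alpha_n,$
$\widetilde{\beta}_n = \|V\|^{1/2} (1 + 2\|U\|^{1/2}\|L\|)\, \beta_n.$    (5.28)

Thus, $\sum_{n\in\mathbb{N}} \sqrt{\lambda_n}\, \widetilde{\alpha}_n < +\infty$ and $\sum_{n\in\mathbb{N}} \lambda_n \widetilde{\beta}_n < +\infty$. Finally, since $\gamma_n \equiv 1$, (5.18) implies that
$\sup_{n\in\mathbb{N}} (1 + \widetilde{\tau}_n)\gamma_n < 2\vartheta$. All the assumptions of Proposition 4.4 are therefore satisfied for algorithm
(5.19). □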
Remark 5.4


(i) Algorithm 5.2 can be viewed as a stochastic version of the primal-dual algorithm investigated
in [20, Example 6.4] when the metric is fixed in the latter. Particular cases of such fixed-metric
primal-dual algorithms can be found in [15, 16, 30, 34, 35].

(ii) The same type of primal-dual algorithm is investigated in 0 in a different context since 
in those papers the stochastic nature of the algorithms stems from the random activation of 
blocks of variables. 


5.2 Example 

We illustrate an implementation of Algorithm 5.2 in a simple scenario in which $H$ is a finite-dimensional
Euclidean space, by constructing an example in which the gradient approximation conditions are fulfilled.

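For every $k \in \{1,\ldots,q\}$ and every $n \in \mathbb{N}$, set $s_{k,n} = \nabla j_k^*(v_{k,n})$ and suppose that $(y_n)_{n\in\mathbb{N}}$ is almost
surely bounded. This assumption is satisfied, in particular, if dom $f$ and $(b_n)_{n\in\mathbb{N}}$ are bounded. In
addition, let

$(\forall n \in \mathbb{N})\quad \mathcal{X}_n = \sigma\big(x_0, v_0, (K_{n'}, z_{n'})_{0 \le n' < m_n}, (b_{n'}, c_{n'})_{0 \le n' < n}\big),$    (5.29)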

where $(m_n)_{n\in\mathbb{N}}$ is a strictly increasing sequence in $\mathbb{N}$ such that $m_n = O(n^{1+\delta})$ with $\delta \in \,]0,+\infty[$,
$(K_n)_{n\in\mathbb{N}}$ is a sequence of independent and identically distributed (i.i.d.) random matrices,
and $(z_n)_{n\in\mathbb{N}}$ is a sequence of i.i.d. random vectors. For example, in signal recovery, $(K_n)_{n\in\mathbb{N}}$
may model stochastic degradation operators IfT^ , while $(z_n)_{n\in\mathbb{N}}$ are observations related to an
unknown signal that we want to estimate. The variables $(K_n, z_n)_{n\in\mathbb{N}}$ are supposed to be independent
of $(b_n, c_n)_{n\in\mathbb{N}}$ and such that $\mathsf{E}\|K_0\|^4 < +\infty$ and $\mathsf{E}\|z_0\|^4 < +\infty$. Set

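$(\forall x \in H)\quad h(x) = \frac{1}{2}\, \mathsf{E}\|K_0 x - z_0\|^2$    (5.30)

and, for every $n \in \mathbb{N}$, let

$u_n = \frac{1}{m_{n+1}} \sum_{n'=0}^{m_{n+1}-1} K_{n'}^{\top} (K_{n'} x_n - z_{n'})$    (5.31)

be an empirical estimate of $\nabla h(x_n)$. We assume that $\lambda_n = O(n^{-\kappa})$, where $\kappa \in \,]1 - \delta, 1] \cap [0,1]$. We
have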

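$(\forall n \in \mathbb{N})\quad \mathsf{E}(u_n \mid \mathcal{X}_n) - \nabla h(x_n) = \frac{1}{m_{n+1}} \big( Q_{0,m_n} x_n - r_{0,m_n} \big),$    (5.32)

where, for every $(n_1, n_2) \in \mathbb{N}^2$ such that $n_1 < n_2$,

$Q_{n_1,n_2} = \sum_{n'=n_1}^{n_2-1} \big( K_{n'}^{\top} K_{n'} - \mathsf{E}(K_0^{\top} K_0) \big)$ and $r_{n_1,n_2} = \sum_{n'=n_1}^{n_2-1} \big( K_{n'}^{\top} z_{n'} - \mathsf{E}(K_0^{\top} z_0) \big).$    (5.33)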
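
The averaged estimate (5.31) can be sketched in a few lines of Python. The scalar setting below ($H = \mathbb{R}$, Gaussian $K_n$ with mean 1, observations $z_n = K_n x_{\mathrm{true}} + \text{noise}$) and the names `batch_size`, `draw_samples`, `empirical_gradient`, `exact_gradient`, and `x_true` are assumptions for illustration only. The function `batch_size` uses $m_n = \lceil (n+1)^{1+\delta} \rceil$, one possible choice of a strictly increasing sequence with $m_n = O(n^{1+\delta})$.

```python
import math
import random


def batch_size(n, delta=0.5):
    """m_n = ceil((n + 1)**(1 + delta)): strictly increasing, O(n**(1 + delta))."""
    return math.ceil((n + 1) ** (1.0 + delta))


def draw_samples(count, x_true=2.0, seed=0):
    """Draw i.i.d. scalar pairs (K, z) with z = K * x_true + noise (assumed model)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(count):
        k = rng.gauss(1.0, 0.1)
        z = k * x_true + rng.gauss(0.0, 0.1)
        samples.append((k, z))
    return samples


def empirical_gradient(samples, x, m):
    """Estimate (5.31): average of K * (K * x - z) over the first m samples."""
    return sum(k * (k * x - z) for k, z in samples[:m]) / m


def exact_gradient(x, x_true=2.0):
    """grad h(x) = E(K^2) * x - E(K z) = E(K^2) * (x - x_true) for this model."""
    second_moment = 1.0 + 0.1 ** 2  # E(K^2) for K ~ N(1, 0.1^2)
    return second_moment * (x - x_true)


samples = draw_samples(batch_size(999))
x_n = 3.0  # a fixed point standing in for the iterate x_n
for n in (0, 9, 99, 998):
    m_next = batch_size(n + 1)  # u_n averages over the first m_{n+1} samples
    err = abs(empirical_gradient(samples, x_n, m_next) - exact_gradient(x_n))
    print(n, m_next, err)
```

The printed error is random but tends to shrink as the batch size $m_{n+1}$ grows, which is the effect quantified by the law of the iterated logarithm in the argument that follows.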










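From the law of iterated logarithm [24, Section 25.8], we have almost surely

$\limsup_{n \to +\infty} \frac{\|Q_{0,m_n}\|}{\sqrt{m_n \log(\log(m_n))}} < +\infty$ and $\limsup_{n \to +\infty} \frac{\|r_{0,m_n}\|}{\sqrt{m_n \log(\log(m_n))}} < +\infty.$    (5.34)

Since $(y_n)_{n\in\mathbb{N}}$ is assumed to be bounded, there exists a $[0,+\infty[$-valued random variable $\eta$ such that
$\sup_{n\in\mathbb{N}} \|y_n\| \le \eta$. Therefore,

$(\forall n \in \mathbb{N})\quad \|x_n\| \le \|x_0\| + \eta.$    (5.35)

Altogether, (5.32)-(5.35) yield

$\lambda_n \|\mathsf{E}(u_n \mid \mathcal{X}_n) - \nabla h(x_n)\|^2 = O\Big( \frac{\lambda_n m_n \log(\log(m_n))}{m_{n+1}^2} \Big) = O\Big( \frac{\log(\log(n))}{n^{1+\delta+\kappa}} \Big).$    (5.36)

Consequently, assumption (c) in Proposition 5.3 holds. In addition, for every $n \in \mathbb{N}$,

$u_n - \mathsf{E}(u_n \mid \mathcal{X}_n) = \frac{1}{m_{n+1}} \big( Q_{m_n,m_{n+1}} x_n - r_{m_n,m_{n+1}} \big),$    (5.37)

which, by the triangle inequality, implies that

$\mathsf{E}(\|u_n - \mathsf{E}(u_n \mid \mathcal{X}_n)\|^2 \mid \mathcal{X}_n) \le \frac{1}{m_{n+1}^2}\, \mathsf{E}\big( (\|Q_{m_n,m_{n+1}}\|\,\|x_n\| + \|r_{m_n,m_{n+1}}\|)^2 \mid \mathcal{X}_n \big)$
$\qquad \le \frac{2}{m_{n+1}^2} \big( \mathsf{E}\|Q_{m_n,m_{n+1}}\|^2\, \|x_n\|^2 + \mathsf{E}\|r_{m_n,m_{n+1}}\|^2 \big).$    (5.38)

Upon invoking the i.i.d. assumptions, we obtain

$(\forall n \in \mathbb{N})\quad \mathsf{E}\|Q_{m_n,m_{n+1}}\|^2 = (m_{n+1} - m_n)\, \mathsf{E}\|K_0^{\top} K_0 - \mathsf{E}(K_0^{\top} K_0)\|^2,$
$\qquad\qquad\quad\;\; \mathsf{E}\|r_{m_n,m_{n+1}}\|^2 = (m_{n+1} - m_n)\, \mathsf{E}\|K_0^{\top} z_0 - \mathsf{E}(K_0^{\top} z_0)\|^2,$    (5.39)

and it therefore follows from (5.35) that

$\zeta_n = \mathsf{E}(\|u_n - \mathsf{E}(u_n \mid \mathcal{X}_n)\|^2 \mid \mathcal{X}_n) = O\Big( \frac{1}{n^{2+\delta}} \Big)$    (5.40)

and

$\lambda_n \zeta_n = O\Big( \frac{1}{n^{2+\delta+\kappa}} \Big).$    (5.41)

Thus, assumption (e) in Proposition 5.3 holds with $\tau_n \equiv 0$.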